It's complicated. I have a task management system I wrote for myself that I basically live on. I realized it's good for handling lists of things and sparse ranking data (what it was built for). Thanks to people's suggestions I have about 500 movies that they've asked me to put in the rotation.
So what I do is I have a function called pull(n) that pulls n items at random, weighted toward the highest-ranked things. When an item has zero ranking data it's technically in the highest cohort. So that's basically equal to randomly pulling some number of movies that haven't hit a poll yet. I then use the poll results not just to pick the movie but also to add ranking data to my system.
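For anyone curious what a rank-weighted pull like that could look like, here is a rough sketch in Python. It is not the actual code from the system. The data layout (a dict of name to a list of scores between 0 and 1), the averaging, and the weighting method (Efraimidis-Spirakis keyed sampling) are all assumptions made for illustration. The only part taken from the description is that items with no ranking data count as the top cohort.

```python
import random

def pull(items, n, rng=random):
    """Pick up to n distinct items, biased toward higher-ranked ones.

    items: dict of name -> list of scores in (0, 1]; an empty list means
    the item has no ranking data yet (hypothetical layout).
    """
    def weight(scores):
        # Unranked items are treated as top cohort, as described above.
        if not scores:
            return 1.0
        # Floor keeps the weight strictly positive for very low averages.
        return max(sum(scores) / len(scores), 0.01)

    # Weighted sampling without replacement: give each item the key
    # u ** (1 / weight) with u uniform in [0, 1), then keep the n largest.
    keyed = [(rng.random() ** (1.0 / weight(s)), name) for name, s in items.items()]
    keyed.sort(reverse=True)
    return [name for _, name in keyed[:n]]

movies = {"Movie A": [], "Movie B": [0.9, 0.8], "Movie C": [0.2], "Movie D": []}
print(pull(movies, 2))
```

With most of the list unranked, nearly every weight is 1.0, so the draw behaves almost like a plain random sample of unpolled movies.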
Still, because there are so many that haven't hit a poll yet, it would be a long time before any of that polling data ever helped improve the selection. That is, if I hadn't made a function called pullexposed(n) that filters down to only items that have at least some ranking data attached to them. So I add like 4 items from that to a larger list so there are a few recognized favorites in the poll.
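A rough sketch of that filter-and-mix step, again in Python and again only an illustration. The build_poll helper, the dict-of-scores layout, and the default of 4 favorites are assumptions. To keep it short, the exposed items are sampled uniformly here, whereas the real pullexposed presumably still leans toward the higher-ranked ones.

```python
import random

def pullexposed(items, n, rng=random):
    """Pick up to n items that already have at least one ranking."""
    exposed = [name for name, scores in items.items() if scores]
    # Simplification: uniform sample among exposed items.
    return rng.sample(exposed, min(n, len(exposed)))

def build_poll(items, size, n_exposed=4, rng=random):
    """Mix a few already-ranked items into a poll of mostly unpolled ones.

    build_poll is a hypothetical helper, not a function named in the post.
    """
    favorites = pullexposed(items, min(n_exposed, size), rng)
    unexposed = [name for name, scores in items.items() if not scores]
    # Fill the remaining slots with movies that have never been in a poll.
    slots = min(size - len(favorites), len(unexposed))
    poll = favorites + rng.sample(unexposed, slots)
    rng.shuffle(poll)  # avoid always listing the favorites first
    return poll
```

Calling build_poll(movies, 12) on a dict shaped like the one described in the lead-in would give a 12-entry poll with up to 4 entries that already have ranking data.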
I've been experimenting with making the polls so large in part to get through the unexposed items faster. Also because we genuinely seem to get better movies when I do that.
But I keep adding movies, and now double features. It really is insane to do something that forces you to gain an intuition for just how many movies exist. When I realize how many obvious movies are still missing from the list (a lot) while I'm at around 500, it's crazy how many movies not only exist but are ones that people actually like and that are culturally significant. There aren't just more movies than we could ever watch. There are more movies than I can use over many, many polls.
That's a crazy amount of complexity for movie nights! Did you make the extra functions just for fun? Would making them truly random not yield some better results?
Well, it's really a crazy amount of complexity for a task management system. I actually did add the pullexposed function because of movie nights, but I now use it inside my task management, so it's really just further build-out of a system I use every day.