A small concern in the above approach is that videos of similar quality with fewer views tend to naturally have high ratios of likes to dislikes (or likes to views). The videos with fewer views tend to be watched by subscribers/aficionados/lovers. As the video gains popularity, it gets exposed to a wider audience, which typically likes it less. Then the ratio goes down. The primary goal is to find the best of all videos. Then we can for each number of views check how many videos have a higher ratio of, e.g. likes to views and only provide the results in top x% of the ratio. A uniform sample of videos across all range of views would be readily obtained by selecting videos in top x% (of some ratio) without specifying the target number of views. That is precisely the idea behind Smart Video Filter.
Though the idea appears simple, the implementation took many evenings over a long period of time. The project is quite rich in required expertise: UX design, architecture, back-end engineering, front-end engineering, and devops skills. The technologies include NoSQL databases MongoDB and Elasticsearch, front-end technologies Angular 5 and Angular Material from Google, CentOS 7 administration and cluster administration. The project went through several stages. The initial UI was written in AngularJS and then rewritten in Angular 5. The initial backend datastore included PostgreSQL, which was unable to support the required I/O loads and was switched to MongoDB. Initially the project used Elasticsearch 2.1 and then gradually migrated up to Elasticsearch 6.2. Retrieval and refresh of videos and channels metadata need to rely on non-trivial algorithms to be efficient and not use up a daily quota before noon. All that relies on heavily multi-threaded and fault-tolerant Java backend. The underlying cluster system implements high availability, when after a complete loss of one machine all systems still work. YouTube terms of service and developer policies are quite strict and took a while to comply with. Recently, I got an audit review from YouTube compliance team, and well, they aren't shutting me down yet.
I'm greatly enjoying the final implementation myself, while searching for the best videos and filtering over duration in a fine-grained way. My long-term hope came true! Even now I got heavily distracted using the service instead of writing the post. 46:1 ratio video of Donald Trump "singing" Shake it Off by Taylor Swift is an amazing composition! The project has obvious ways to improve to substitute the YouTube original search even more: provide recommendations and play videos on the site without redirecting to YouTube. However, Smart Video Filter is a "small market share" application. If the "market maker" YouTube itself was to implement it, then lots of videos would not be regularly shown in search results/recommendations, which would have discouraged the content creators. Hope you enjoy this niche service as I enjoy it myself!