Rank of post in Reddit is based on Up-Votes and Down-Votes and Age of the post. As per the Reddit Source Code on Github, The ranking algorithm was updated on January 12, 2014. This algorithm keeps interesting  articles in front page. In this article I am going to explain how to implement this algorithm with SQL

Reddit Ranking Algorithm

cpdef double _hot(long ups, long downs, double date):
    """The hot formula. Should match the equivalent function in postgres."""
    s = score(ups, downs)
    order = log10(max(abs(s), 1))
    if s > 0:
        sign = 1
    elif s < 0:
        sign = -1
    else:
        sign = 0
    seconds = date - 1134028003
    return round(order + sign * seconds / 45000, 7)
score = upvotes - downvotes
date = Age of the post (Submission Time)
Simplified math equation of the algorithm look like this. 
f(score,age) = log10(score) + (sign)(age / 45000)
sign = sign value is based on score. if score is greater than zero then sign is +1 else sign is -1 

Database Design

We need minimum 2 tables to implement this algorithm.
reddit_post - To maintain rank and age
reddit_vote - To keep track of user votes for posts

Create reddit_post table

create table reddit_post (
 postid varchar(20),
 rank double,
 ctime TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
 primary key(postid)
)
postid - Unique ID of the post
rank - Rank of the post
ctime - Age of the post

Create reddit_vote table

create table reddit_vote (
 postid varchar(20),
 user varchar(20),
 vflag integer,
 ctime TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
 primary key(postid,user,vflag),
 foreign key(postid) references reddit_post(postid)
)
postid - Unique ID of the post
user - User ID
vflag - vote flag ( +1 for up-vote and -1 for down-vote)

Insert into reddit_post

insert into reddit_post(postid, rank) values('post1',3.1029353) 

Insert into reddit_vote

If user up-voted the post then insert 1 for vflag
If user down-voted the post then insert -1 for vflag
insert into reddit_vote(postid, user, vflag) values('post1','srinivas',1) 
insert into reddit_vote(postid, user, vflag) values('post1','srinivas',-1) 

SQL query to Get age

select ctime from reddit_vote where postid = 'post1'

SQL query to Get score

select sum(vflag) from reddit_vote where postid = 'post1'

Calculate rank

  1. Find age of the post by using above SQL query
  2. Find score of the post by using above SQL query
  3. Calculate sign ( If score is greater than 0 then sign is +1 else sign is -1)
  4. Now calculate the rank of the post  f(score,age) = log10(score) + (sign)(age / 45000)

0 comments:

Search This Blog

Loading...

Blogroll

Srinivas Dasari
find me on facebook
follow me on twitter

Follow this blog by Email

Popular Posts