The HNSearch API enables developers to access HN data programmatically via simple HTTP requests. This documentation describes how to request data from the API and how to interpret the response.
If you have any questions about the API, you can chat with us at our Developer Forum or send us an email (hnsearch@thriftdb.com).
The HNSearch webapp is a simple JavaScript app that sends queries directly to the api.hnsearch.com bucket at ThriftDB. Because the hnsearch bucket is public, the ThriftDB REST API serves as the HNSearch API itself.
All requests to the HNSearch API are simple HTTP GET requests. Data collections can be accessed at:
http://api.thriftdb.com/api.hnsearch.com/<collection>
Currently, all responses are returned in JSON format.
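As a minimal sketch of the request pattern described above, the following Python snippet builds a collection URL and defines a helper that sends a GET request and decodes the JSON body. The helper names `collection_url` and `fetch_json` are illustrative, not part of the API, and the `items` collection name is taken from the example search URL in this documentation.

```python
import json
import urllib.request

# Base URL of the public api.hnsearch.com bucket at ThriftDB (from this documentation).
BASE_URL = "http://api.thriftdb.com/api.hnsearch.com"


def collection_url(collection):
    """Return the URL of a data collection, for example 'items'."""
    return f"{BASE_URL}/{collection}"


def fetch_json(url, timeout=10):
    """Send an HTTP GET request and decode the JSON response body.

    Hypothetical helper: it assumes the service is reachable and
    returns UTF-8 encoded JSON, as the documentation states.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access; call fetch_json(...) to send the request.
print(collection_url("items"))
```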
Item fields:

- id - The item's unique integer id (not searchable)
- parent_id - The parent comment's id
- parent_sigid - The parent comment's signed id
- points - The number of points
- username - The submitter's username
- type - Item type (submission|comment)
- url - A submission URL
- domain - A submission URL's domain name
- title - A submission title
- num_comments - Number of submission comments
- text - The submission/comment content
- discussion{} - A comment's parent discussion
  - id - The discussion's item id (not searchable)
  - sigid - The discussion's signed id
  - title - The discussion's item title (not searchable)
- create_ts - When the item was created
- cache_ts - When the item was last cached

User fields:

- username - The user's unique name
- about - The user's bio
- karma - The number of points a user has
- create_ts - When the user was created
- cache_ts - When the user object was last cached

The HNSearch API returns appropriate HTTP status codes for API requests. In particular, it returns a 503 (Service Unavailable) status code when the service is down for maintenance. Your application should handle server errors gracefully in case there is planned downtime or an unexpected failure.
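One way to handle a 503 gracefully is to retry with a growing delay and give up after a few attempts. The sketch below is an assumption about reasonable client behavior, not something the API prescribes: the function name, the retry count, and the exponential backoff are all illustrative. The `opener` and `sleep` parameters exist only so the logic can be exercised without a network connection.

```python
import json
import time
import urllib.error
import urllib.request


def get_json_with_retry(url, retries=3, delay=1.0,
                        opener=urllib.request.urlopen, sleep=time.sleep):
    """GET `url` and decode JSON, retrying only on 503 (Service Unavailable).

    Waits delay, 2*delay, 4*delay, ... between attempts (illustrative policy).
    Any other HTTP error, or a 503 on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return json.loads(response.read().decode("utf-8"))
        except urllib.error.HTTPError as err:
            if err.code != 503 or attempt == retries:
                raise
            sleep(delay * (2 ** attempt))
```

A caller would typically wrap this in its own `try`/`except urllib.error.HTTPError` block and show a "temporarily unavailable" message if the final attempt still fails.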
For a full list of ThriftDB REST API methods and arguments please see the ThriftDB REST API documentation.
For a full explanation of the ThriftDB Search API please see the ThriftDB Search API documentation. The following is a summary of the arguments and response objects for the search endpoint:
http://api.thriftdb.com/api.hnsearch.com/<collection>/_search
| Argument | Datatype | Default |
|---|---|---|
| q | string | none |
| start | integer | 0 |
| limit | integer | 10 |
| sortby | string | 'score desc' |
| filter[fields][<fieldname>][] | string | none |
| filter[queries][] | string | none |
| facet[fields][<fieldname>][include] | boolean | false |
| facet[fields][<fieldname>][exclude_filter] | boolean | true |
| facet[fields][<fieldname>][start] | integer | 0 |
| facet[fields][<fieldname>][limit] | integer | 10 |
| facet[queries][] | string | none |
| weights[<fieldname>] | float | 1.0 |
| boosts[fields][<fieldname>] | float | none |
| boosts[filters][<filterstring>] | float | none |
| boosts[functions][<functionstring>] | float | none |
| highlight[markup_items] | boolean | false |
| highlight[include_matches] | boolean | false |
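The bracketed argument names above map onto ordinary query-string parameters. The following sketch shows one way to assemble a search URL in Python. The function name `search_url` and its keyword arguments are hypothetical, and leaving the square brackets unescaped is an assumption based on the example URL in this documentation, not a verified requirement of the service.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.thriftdb.com/api.hnsearch.com"


def search_url(collection, q, start=0, limit=10, sortby=None, field_filters=None):
    """Build a _search URL from a query string and optional field filters.

    field_filters maps a field name to a list of accepted values; each value
    becomes one filter[fields][<fieldname>][] parameter.
    """
    params = [("q", q), ("start", start), ("limit", limit)]
    if sortby is not None:
        params.append(("sortby", sortby))
    for field, values in (field_filters or {}).items():
        for value in values:
            params.append((f"filter[fields][{field}][]", value))
    # safe="[]" keeps the brackets readable; everything else is percent-encoded.
    return f"{BASE_URL}/{collection}/_search?{urlencode(params, safe='[]')}"


# Search submissions only, using the 'type' item field described earlier.
print(search_url("items", "facebook", limit=5,
                 field_filters={"type": ["submission"]}))
```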
- hits - The total number of matched items
- time - The amount of time it took to process the search request
- request{} - The request parameters used by the server
  - q
  - start
  - limit
  - sortby
  - filter{}
    - fields{<fieldname>:<fieldvalues>[]}
    - queries[]
  - facet{}
    - fields{}
      - <fieldname>{}
        - include
        - start
        - limit
        - exclude_filter
    - queries[]
  - weights{<fieldname>:<weightvalue>}
  - boosts{}
    - fields{<fieldname>:<boostfactor>}
    - filters{<filterstring>:<boostfactor>}
    - functions{<functionstring>:<boostfactor>}
  - highlight{}
    - markup_items
    - include_matches
- results[]{} - Sorted list of matched items and highlights
  - item{} - A matched item
  - score
- facet_results{}
  - fields
    - <fieldname>{}
      - facets[]{}
        - value - The value of the facet
        - count - The number of matched items
  - queries{<queryvalue>:<querycount>}

ThriftDB lets you use field weights and numeric attributes to influence an item's match score. It also lets you boost by more complicated mathematical functions.
The webapp at hnsearch.com uses the following combination of field weights, field boosts, and the Hacker News hotness algorithm to rank results:

    "weights": {
        "title"   : 1.1,
        "text"    : 0.7,
        "url"     : 1.0,
        "domain"  : 2.0,
        "username": 0.1,
        "type"    : 0.0
    },
    "boosts": {
        "fields": {
            "points"      : 0.15,
            "num_comments": 0.15
        },
        "functions": {
            "pow(2,div(div(ms(create_ts,NOW),3600000),72))": 200.0
        }
    }
Here's an example search URL:
http://api.thriftdb.com/api.hnsearch.com/items/_search?q=facebook&weights[title]=1.1...
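The ranking settings can be flattened into `weights[...]` and `boosts[...][...]` query parameters of the kind shown in the example URL. The sketch below does this in Python using the values listed in this documentation. The helper names are hypothetical, and the exact encoding the service expects for the function string (parentheses and commas are percent-encoded here) is an assumption that has not been checked against the live API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

SEARCH_URL = "http://api.thriftdb.com/api.hnsearch.com/items/_search"

# Ranking settings copied from the hnsearch.com configuration shown above.
WEIGHTS = {"title": 1.1, "text": 0.7, "url": 1.0,
           "domain": 2.0, "username": 0.1, "type": 0.0}
BOOSTS = {
    "fields": {"points": 0.15, "num_comments": 0.15},
    "functions": {"pow(2,div(div(ms(create_ts,NOW),3600000),72))": 200.0},
}


def ranking_params(weights, boosts):
    """Flatten weights and boosts into (name, value) query parameters."""
    params = [(f"weights[{field}]", value) for field, value in weights.items()]
    for group, entries in boosts.items():
        for key, factor in entries.items():
            params.append((f"boosts[{group}][{key}]", factor))
    return params


def ranked_search_url(q, weights=WEIGHTS, boosts=BOOSTS):
    """Build a search URL that carries the query plus the ranking settings."""
    params = [("q", q)] + ranking_params(weights, boosts)
    return f"{SEARCH_URL}?{urlencode(params, safe='[]')}"


url = ranked_search_url("facebook")
print(url)
# Decoding the query string recovers the original parameter names and values.
print(parse_qsl(urlsplit(url).query)[:3])
```

Passing different `weights` or `boosts` dictionaries to `ranked_search_url` is a simple way to experiment with alternative rankings.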
Developers are encouraged to use their own ranking algorithms to rank results. For a more thorough explanation of how match scores are calculated, developers can consult the Lucene scoring documentation.