LISTSERV 16.5 - CODE4LIB Archives

Hi Eric,

As Francis and Darnelle said, Twitter's primary free search API is limited to the last 7 days of activity. This so-called "Standard" search API is what twarc uses to gather data when you run `twarc search …`

However, a couple of years ago Twitter added the Premium Search API [1], a hybrid approach that lets you search two endpoints (30 day and full archive). It is engineered to move you from collecting data for free to paying Twitter as you (inevitably) want to gather more.
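
To make the two endpoints concrete, here is a rough sketch of how a request URL for each could be built, based on my reading of the docs in [1]. It is not twarc code. The helper name `premium_search_url` and the environment label `dev` are made up for illustration, and a real request would also need a bearer token in the Authorization header.

```python
from urllib.parse import urlencode

# Base path for the premium search endpoints, as described in [1].
PREMIUM_BASE = "https://api.twitter.com/1.1/tweets/search"


def premium_search_url(product, label, query, from_date=None, to_date=None):
    """Build (but do not send) a premium search request URL.

    product   -- "30day" or "fullarchive"
    label     -- the dev environment label set up in the Twitter
                 developer dashboard ("dev" below is hypothetical)
    from_date -- optional lower bound, formatted YYYYMMDDHHmm
    to_date   -- optional upper bound, formatted YYYYMMDDHHmm
    """
    if product not in ("30day", "fullarchive"):
        raise ValueError("product must be '30day' or 'fullarchive'")
    params = {"query": query}
    if from_date:
        params["fromDate"] = from_date
    if to_date:
        params["toDate"] = to_date
    return f"{PREMIUM_BASE}/{product}/{label}.json?{urlencode(params)}"


# Hypothetical usage: search the full archive for a hashtag-free term.
print(premium_search_url("fullarchive", "dev", "code4lib",
                         from_date="201801010000"))
```

The only difference between the two endpoints in this sketch is the `product` path segment; how far back you can search and how many requests you get depend on your account tier.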

From your email it sounds like you want to use the Full Archive endpoint? Adding premium support to twarc has been on the Documenting the Now roadmap, but we haven't quite gotten around to it yet.

I went ahead and created a GitHub issue so you can track our progress [2]. It actually shouldn't be too difficult to add, so if you have a present need, let us know so we can give it higher priority.

//Ed

PS. As Francis mentioned, twint gets around Twitter's API constraints by scraping Twitter's search results web page. Scraping comes with its own set of complexities; the biggest is that Twitter actively works to prevent it, which (in my experience) can make twint a bit unpredictable to use at times.

[1] https://developer.twitter.com/en/docs/tweets/search/overview/premium
[2] https://github.com/DocNow/twarc/issues/326