Unfortunately, historical access has traditionally cost a great deal of money. The Firehose stream costs millions, so companies are hesitant to just give it all away, ha. Most of the time you need to go through a reseller like Gnip, DataSift, or Topsy. Topsy at least has a public search API, for example: http://topsy.com/s?q=%23occupyboston
You can also try using some trickery, like searching within the Twitter site on Google, for example: https://www.google.com/#q=site:twitter.com+%22%23occupyboston%22
And while that somewhat works, you will still need a way to scrape those web results before getting into any analysis. I'm thinking Beautiful Soup for Python might be handy here.
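To give a rough idea of the scraping step: Beautiful Soup is the usual tool, but here is a self-contained sketch using only the stdlib html.parser as a stand-in. The HTML snippet is made up for illustration; a real search-results page would be messier.

```python
# Sketch of pulling twitter.com links out of a saved search-results page.
# Beautiful Soup would make this nicer; the stdlib html.parser shows the idea.
from html.parser import HTMLParser


class TweetLinkParser(HTMLParser):
    """Collect href values that point at twitter.com."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and "twitter.com" in value:
                    self.links.append(value)


# Made-up HTML standing in for a saved results page:
html = """
<div class="result">
  <a href="https://twitter.com/someuser/status/123456">a tweet</a>
  <a href="https://example.com/not-a-tweet">other site</a>
</div>
"""
parser = TweetLinkParser()
parser.feed(html)
print(parser.links)  # only the twitter.com link survives
```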
Anyway, all that to say I'm not aware of any API that will return historical tweets in JSON format. It's the main reason people do personal setups and store the 1% stream in a database on their own machine - makes historical searches a lot easier, ha.
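The personal-archive setup can be as simple as piping the stream into SQLite. A minimal sketch, with made-up table and column names (the tweet fields follow the 2015-era payload):

```python
# Minimal sketch: stash tweets from the 1% stream in SQLite so you can
# run historical searches later. Table and column names are made up.
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a real archive
conn.execute("""CREATE TABLE tweets (
    id TEXT PRIMARY KEY, created_at TEXT, text TEXT, geo TEXT)""")


def store(raw_line):
    """Insert one line of stream output (a JSON tweet) into the table."""
    t = json.loads(raw_line)
    conn.execute("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)",
                 (t["id_str"], t["created_at"], t["text"],
                  json.dumps(t.get("geo"))))


# Fake stream line for illustration:
store('{"id_str": "1", "created_at": "Tue Jan 13 2015", '
      '"text": "#occupyboston", "geo": null}')
rows = conn.execute(
    "SELECT id FROM tweets WHERE text LIKE '%#occupyboston%'").fetchall()
print(rows)
```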
DT
--------------------------------------------
On Tue, 1/13/15, LB Byers <[log in to unmask]> wrote:
Subject: Re: GIS data from Instagram, Twitter, etc.
To: "D T" <[log in to unmask]>
Date: Tuesday, January 13, 2015, 9:16 AM
Thanks for the info about JSON objects. I will read up on it and see if it makes sense to go that route. Is it possible to export this type of data for old tweets and Instagram posts, say from 4 months ago, rather than from streams, to CSV or SHP?
On Mon, Jan 12, 2015 at 11:12 PM, D T <[log in to unmask]> wrote:
As Chris said, the public APIs return a JSON object, so it's fairly easy to make it into GeoJSON. ArcGIS and GeoJSON are kind of weird together, because ESRI in their infinite wisdom decided that there should be an ESRI JSON format, which is different from GeoJSON. Thankfully, ESRI also makes a converter, not to mention there is always GDAL for converting from GeoJSON to SHP. As a side note, you can load GeoJSON directly into QGIS.
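One gotcha worth a sketch: if memory serves, the old Twitter payload's legacy "geo" field lists coordinates as [lat, lon], while GeoJSON wants [lon, lat], so you have to flip them when building a Feature. Field names here follow that 2015-era payload:

```python
# Sketch: turn a tweet with coordinates into a GeoJSON Feature.
# The legacy "geo" field is [lat, lon]; GeoJSON wants [lon, lat], so flip.
import json


def tweet_to_feature(tweet):
    lat, lon = tweet["geo"]["coordinates"]
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"text": tweet["text"]},
    }


# Made-up tweet for illustration (roughly Boston):
tweet = {"geo": {"coordinates": [42.36, -71.06]}, "text": "#occupyboston"}
feature = tweet_to_feature(tweet)
print(json.dumps(feature))
```

The resulting FeatureCollection of such features loads straight into QGIS, or through GDAL on the way to SHP.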
Anyway, I went ahead and uploaded a couple of Python script examples I've used in the past to collect tweets from the 1% Twitter public stream. You can grab API keys from Twitter and plug them into any of the scripts to get running right away. After you've got a JSON or text file, you can parse out which tweets have coordinates. Basically, get rid of any tweets that have "geo":null, since they don't have any coordinates.
https://github.com/dmofot/twitterstream
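The filtering step is a few lines of Python. This sketch assumes one JSON tweet per line, the way stream-collection scripts typically write their output; the sample lines are made up:

```python
# Sketch: keep only tweets that carry coordinates, dropping "geo": null.
# Assumes one JSON tweet per line; sample lines are made up.
import json

lines = [
    '{"text": "located", "geo": {"coordinates": [42.36, -71.06]}}',
    '{"text": "no location", "geo": null}',
]
parsed = [json.loads(ln) for ln in lines]
located = [t for t in parsed if t.get("geo") is not None]
print(len(located))  # 1
```

In a real run you would read the lines from the collected file instead of a list, but the null check is the same.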
Hope that at least helps you get started.
DT
-------------------------------------------------------------------------
This list (NEARC-L) is an unmoderated discussion list for
all NEARC Users.
If you no longer wish to receive e-mail from this list, you
can remove yourself by going to http://listserv.uconn.edu/nearc-l.html.