Using Python to Import Your Dumped Tweets into MongoDB

I’ve been playing around with my tweets. In case you didn’t know, you can download your entire tweet archive and play around with it too. The format is JSON, so I think it makes perfect sense to dump it into MongoDB. You can’t import the files straight away, though: each one is a .js file that starts with a JavaScript assignment, so everything up to the first "=" has to be stripped before the rest parses as JSON. I’m not good at Python, so the code here might look clumsy to Python folks. I’m going to use this data in my analysis, which will be covered in the next blog post.

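import glob
import json
import os

import pymongo

path = './data'
client = pymongo.MongoClient('localhost', 27017)
db = client.tweets


def main():
    # Each archive file is JavaScript: an assignment followed by a JSON array.
    for infile in glob.glob(os.path.join(path, "*.js")):
        with open(infile, encoding="utf-8") as f:
            content = f.read()
        # Drop everything up to and including the first "=" so only JSON is left.
        index_of_first_equal_sign = content.find("=") + 1
        pure_json = content[index_of_first_equal_sign:]
        tweets = json.loads(pure_json)
        # insert() was removed in PyMongo 4; insert_many() needs a non-empty list.
        if tweets:
            db.tweets_collection.insert_many(tweets)


if __name__ == "__main__":
    main()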

The tweet files are kept in the data folder (./data, relative to where the script is run).
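
If you want to try the stripping step without a running MongoDB, here is a minimal sketch of just that part. The helper name strip_js_prefix and the sample string are my own, made up for illustration; the real archive files may look different, so treat the sample as an assumption.

```python
import json


def strip_js_prefix(content):
    """Return the text after the first "=", which is assumed to be pure JSON."""
    # find() returns -1 when there is no "=", so the whole string is kept then.
    return content[content.find("=") + 1:]


# Made-up sample imitating one archive file; the variable name and the
# fields are illustrative, not copied from a real archive.
sample = 'Grailbird.data.tweets_2013_09 = [{"id": 1, "text": "a = b"}]'

tweets = json.loads(strip_js_prefix(sample))
print(tweets[0]["text"])  # prints: a = b
```

Only the first "=" is treated as the separator, so an equals sign inside a tweet's text survives untouched.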

Sep 4th, 2013
