We are not very active on Twitter; we mostly just let Twitterfeed create tweets from the blog posts here. If you keep up with things via Twitter rather than a feed reader, though, you can get announcements of new blog posts from our Twitter feed:
www.twitter.com/rdbhost
Tuesday, February 16, 2010
Saturday, February 13, 2010
SQLAlchemy and the Stackoverflow Data Dump
We recently wrote an rdbhost module for SQLAlchemy (SQLA), allowing you to use Rdbhost databases through SQLAlchemy. It was mentioned on this blog about a month ago, here.
Over the last couple of days I uploaded the February Stackoverflow data dump to our SO database here (account 767), and posted an announcement on meta.SO, including a mention of the DB API module. If you are an SQLAlchemy fan and wish to explore the SO data via SQLAlchemy, this post will help you get started.
You need SQLAlchemy installed; it can be installed via easy_install, or manually from a downloaded copy. You also need the rdbhost module, placed in the databases folder within your SQLAlchemy install; SQLA will find it there.
Example:
Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] on win32
IDLE 1.2.4
>>> from sqlalchemy import *
>>> role = 'r0000000767'
>>> authcode = '-'
>>> dsn = 'rdbhost://'+role+':'+authcode+'@www.rdbhost.com'
>>> db = create_engine(dsn)
>>> meta = MetaData(db)
>>> meta.reflect(schema='so')
... elided warnings from sqlalchemy about partial indexes ...
>>> for t in meta.tables:
...     print t
so.posts
so.badges
so.tags
so.users
so.votes
so.comments
so.tagging
>>> posts = meta.tables['so.posts']
>>> print posts.columns
['posts.id', 'posts.type', 'posts.parentid', 'posts.title', 'posts.creation', 'posts.owner', 'posts.accepted_answer', 'posts.score', 'posts.viewcount', 'posts.body', 'posts.lasteditor', 'posts.lasteditorname', 'posts.lasteditdate', 'posts.lastactivitydate', 'posts.communityowneddate', 'posts.closeddate', 'posts.tags', 'posts.answercount', 'posts.commentcount', 'posts.favoritecount']
>>> from sqlalchemy import select
>>> users = meta.tables['so.users']
>>> s = select([users], users.c.reputation > 100000, limit=50)
>>> recs = db.execute(s)
>>> for user in recs:
...     print user.name, user.reputation
Marc Gravell 109683
Jon Skeet 134930
>>>
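If you want to see what the final select above boils down to without connecting to Rdbhost, here is a minimal stand-alone sketch using Python's stdlib sqlite3 (Python 3 syntax). The users table is pared down to three columns, the two high-reputation rows are taken from the session output above, and the third row is invented purely to show the filter excluding something; this mirrors the SQL SQLAlchemy emits, it is not the rdbhost module itself.

```python
import sqlite3

# A tiny stand-in for the so.users table; column names mirror the
# reflected table above, but the data here is illustrative, not the
# real Stackoverflow dump.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, reputation INTEGER)')
conn.executemany(
    'INSERT INTO users (name, reputation) VALUES (?, ?)',
    [('Marc Gravell', 109683), ('Jon Skeet', 134930), ('Lurker', 42)])

# Roughly the SQL that select([users], users.c.reputation > 100000, limit=50)
# compiles to:
for name, reputation in conn.execute(
        'SELECT name, reputation FROM users '
        'WHERE reputation > 100000 LIMIT 50'):
    print(name, reputation)
```

The WHERE clause drops the low-reputation row, so only the two names from the session output are printed.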
Friday, February 12, 2010
Bulk Data Transfer to/from Rdbhost
An upgrade to the server recently added the ability to upload and download data in bulk.
Downloading works very similarly to pg_dump, providing any of the three formats that pg_dump supports.
a) plaintext
b) tar file
c) compressed format
Uploading works like psql for plaintext, and like pg_restore for the other two formats. In fact, the server feeds the input to those two utilities behind the scenes, so the results are exactly what you would get by running them directly. Before uploading, drop any existing tables that appear in the upload, as the upload does not drop them for you.
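For reference, the three formats above correspond to pg_dump's -F flag when you dump a local database yourself; rdbhost produces the same output server-side. This is a local-usage sketch, and mydb is a placeholder database name, not anything rdbhost-specific.

```shell
# The three dump formats, as produced by a local pg_dump:
pg_dump -Fp mydb > mydb.sql    # plaintext
pg_dump -Ft mydb > mydb.tar    # tar archive
pg_dump -Fc mydb > mydb.dump   # compressed custom format

# Restoring: psql handles plaintext; pg_restore handles the other two.
psql mydb < mydb.sql
pg_restore -d mydb mydb.tar
pg_restore -d mydb mydb.dump
```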
The bulk transfer page is linked from the profile page, after logging in.
The upload streams the data to PostgreSQL, so processing starts as soon as the upload does; if you cancel the upload partway through (by closing the browser window, for example), you will likely leave the database in an incomplete state. Remember to clean the database again before retrying the upload.