Importing LARGE csv into SQLite

I have a 5 GB CSV file (also available as a SAS data file, if that would be easier) that I need to load into a SQL database so I can work with it in R.

The variable names are all in the first line and are double-quoted. There are 1000+ variables, some numeric and others character (some of the character variables are strings of numerals, but I'm not too worried about that; I can fix it in R).

My question is how can I import the csv file into a new table in my database with minimal pain?

I've found advice saying to create the table first (which means specifying all the variables, of which I have 1000+) and then use ".import file table" to bring in the data, or to use some GUI import wizard, which is not an option for me.

Sorry if this is SQL 101, but thanks for the help.

Answers


Here's my workflow:

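library("RSQLite")
setwd("~/your/dir")
## creates the database file if it does not exist yet
db <- dbConnect(SQLite(), dbname="your_db.sqlite")
## optional: explicit SQLite type for each column, keyed by column name
field.types <- list(
        date="INTEGER",
        symbol="TEXT",
        permno="INTEGER",
        shrcd="INTEGER",
        prc="REAL",
        ret="REAL")
## passing a file name as `value` imports the csv straight from disk
dbWriteTable(conn=db, name="your_table", value="your_file.csv", row.names=FALSE, header=TRUE, field.types=field.types)
## the index must be created on the table that was just written
dbExecute(db, "CREATE INDEX IF NOT EXISTS idx_your_table_date_sym ON your_table (date, symbol)")
dbDisconnect(db)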

The field.types argument isn't necessary. If you don't provide this list, RSQLite takes the column names from the header and guesses the types from the data. The index isn't required either, but it will speed up later queries, provided you index the columns those queries actually filter or join on.
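With 1000+ columns, typing the field.types list by hand is not realistic. If you do want explicit types, one option is to build the list from a sample of the file. The following is only a rough sketch in base R: guess_field_types is a hypothetical helper (not part of RSQLite), the 1000-row sample size is an arbitrary assumption, and a type guessed from the sample may not hold for later rows.

```r
# Hypothetical helper (not part of RSQLite): infer SQLite column types
# from the first `n` data rows of a csv file that has a header line.
guess_field_types <- function(csv_path, n = 1000) {
  sample_rows <- read.csv(csv_path, nrows = n, header = TRUE,
                          stringsAsFactors = FALSE, check.names = FALSE)
  r_classes <- vapply(sample_rows, function(col) class(col)[1], character(1))
  # Map R classes to SQLite types. Anything else (including all-empty
  # columns, which R reads as logical) falls back to TEXT.
  sqlite_types <- c(integer = "INTEGER", numeric = "REAL", character = "TEXT")
  types <- unname(sqlite_types[r_classes])
  types[is.na(types)] <- "TEXT"
  setNames(as.list(types), names(sample_rows))
}

# Small demonstration on a throwaway file.
demo_csv <- tempfile(fileext = ".csv")
writeLines(c('"date","symbol","prc"',
             '20100104,"IBM",132.45',
             '20100105,"IBM",130.85'), demo_csv)
field.types <- guess_field_types(demo_csv)
str(field.types)
```

The resulting list can be passed as the field.types argument of the dbWriteTable call, with your real csv path in place of demo_csv.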

I've been learning a lot of this stuff here on SO, so if you check the questions I've asked and answered on SQLite, you may find some tangential material.

