Last week I wrote a
on how to build a basic database from a
.csv files using SQLite, and today I finally
got around to writing this short post on what to do next (i.e., how do your data out of
the database and into R).
This task requires the R packages DBI and RSQLite (use
to install). After opening R you must load these libraries:
You must connect R to the SQLite database file using
dbConnect(). The extension can be
.sqlite, whichever you named the database file when you created it via
SQLite. Note - for this to work you must either set R’s working directory to the
directory where your database lives or provide the path to the database.
con = dbConnect(SQLite(), dbname="mydatabase.db")
Next, you build your query (i.e., describe what you want to grab from the SQLite database). In the query below, we will be grabbing the entire column named “strain” from the table named “behaviour” from the database we just connected to.
myQuery <- dbSendQuery(con, "SELECT strain FROM behaviour")
After building your query, you can use that query to fetch the data from the database and
assign that to an R object. The
n = -1 argument in
dbFetch() must be used if you want
to retrieve all records in the database, otherwise the default is to stop at 500 records.
my_data <- dbFetch(myQuery, n = -1)
Now that the data is assigned the data to an object in R, it is advisable to clear the query. With very large datasets, this can be import for avoiding the exhaustion of resources (memory, file descriptors, etc.).
And that’s it! Now a subset of the data you want is in an R object. When you are done analyzing that aspect of the data, you can remove that R object and load the next subset of the database you need to analyze.