We have a csv with Pokemon data in it that we'll be using to build up our database schema, so the first step is to get the csv file into Rust structs.
Download the csv from GitHub and put it in crates/upload-pokemon-data/pokemon.csv.
Note that the csv data has stayed the same since the first version of this workshop, so if you care: many modern pokemon are missing from the dataset.
We're going to add the csv crate to our upload-pokemon-data package using cargo add, which is a command built into cargo these days. We can specify which package we want to add csv to using the -p flag.
cargo add -p upload-pokemon-data csv
Then we can replace our main function with an implementation that reads the csv in and processes it.
fn main() -> Result<(), csv::Error> {
    let mut rdr = csv::Reader::from_path(
        "./crates/upload-pokemon-data/pokemon.csv",
    )?;
    for result in rdr.records() {
        let record = result?;
        dbg!(record);
    }
    Ok(())
}
We'll start off by changing the type signature of main to return a Result<(), csv::Error>. The only errors we'll encounter in this lesson are csv::Errors, and the main function has to return unit, (), if successful: Ok(()).
Adding the Result type to our main function allows us to use the ? operator on Result types returned from the csv crate functions to handle any errors.
For example, csv::Reader::from_path accepts a filepath to the csv file we want to read. We've hardcoded a relative file path, so note you'll have to run this binary from the root of the project.
let mut rdr = csv::Reader::from_path(
    "./crates/upload-pokemon-data/pokemon.csv",
)?;
The from_path method returns a Result<Reader<File>, csv::Error>. Since the main function's return type uses the same error type, we can use ? on the Result<Reader<File>, csv::Error> to turn it into a Reader<File>. If the value was an error, it gets returned from the main function immediately and the program ends.
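As a rough illustration of what ? does, the sketch below writes the same function twice: once with ? and once with an explicit match. The parse_hp and parse_hp_explicit names are made up for this example, and it uses the standard library's ParseIntError rather than csv::Error so that it runs without any crates. The real ? operator also converts the error with From when the error types differ, which the match version here leaves out.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a single numeric field using `?`.
fn parse_hp(field: &str) -> Result<u32, ParseIntError> {
    // On Ok, `?` hands us the inner u32. On Err, it returns early.
    let hp = field.parse::<u32>()?;
    Ok(hp)
}

// Roughly the same function with the early return written out by hand.
fn parse_hp_explicit(field: &str) -> Result<u32, ParseIntError> {
    let hp = match field.parse::<u32>() {
        Ok(value) => value,
        Err(error) => return Err(error),
    };
    Ok(hp)
}
```

Both functions return Ok(45) for "45" and the same parse error for a field like "fast".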
After the ?, rdr is a Reader<File>. We let Rust know we're going to need exclusive access to the reader so we can mutate it by using the mut keyword. Rust will use this information to make sure we aren't mutating the Reader<File> from multiple locations at once, which would result in confusing and hard-to-debug bugs.
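To see why mut is needed, here is a small sketch using a standard library iterator instead of the csv reader. The first_two function is a made-up helper; the point is only that next() takes &mut self because it advances the iterator, in the same way that reading rows advances the reader.

```rust
// Hypothetical helper: returns the first two comma-separated fields, if present.
fn first_two(text: &str) -> (Option<&str>, Option<&str>) {
    // Without `mut` this would not compile: next() needs exclusive,
    // mutable access because each call moves the iterator forward.
    let mut fields = text.split(',');
    let first = fields.next();
    let second = fields.next();
    (first, second)
}
```

Removing the mut keyword from fields makes the compiler reject the calls to next(), which is the same kind of check Rust applies to rdr.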
The Reader<File> type is the csv::Reader struct from the csv crate. It wraps a type that implements a different, similarly named trait from the standard library: std::io::Read. The reading methods on csv::Reader are available whenever the inner type (File) implements Read. Types that implement Read are usually called "readers", so the name is appropriate.
A reader allows us to read bytes from... somewhere. In this case it's a file, but it could also be a tcp socket or something else entirely. Readers allow us to build complex functionality on top of them in case we need to perform actions on truly massive files that don't fit in memory, or files that don't fully exist yet.
Our usage is pretty ordinary by comparison: we could easily read the entire csv into a string without issue considering how small our csv is.
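The idea that a reader can be "anything that produces bytes" can be sketched with the standard library alone. The count_lines function below is a hypothetical helper, not part of the csv crate: it accepts any type that implements Read, so the same code works for a File, a byte slice, or a socket.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical helper: counts the lines available from any reader.
fn count_lines<R: Read>(source: R) -> io::Result<usize> {
    let mut count = 0;
    // BufReader wraps the reader so we can pull one line at a time
    // instead of loading everything into memory first.
    for line in BufReader::new(source).lines() {
        line?; // stop and return the I/O error if a read fails
        count += 1;
    }
    Ok(count)
}
```

Calling count_lines(text.as_bytes()) works because a byte slice implements Read; an opened File could be passed in exactly the same way.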
The csv::Reader includes a records function that lets us loop over each row nicely though, so we'll use that.
for result in rdr.records() {
    let record = result?;
    dbg!(record);
}
records returns a type that implements the Iterator trait, so we can use it in a for loop to access each row.
Each item that Iterator yields wraps a StringRecord struct with the values for each row. Here's an example:
StringRecord(["Bulbasaur", "1", "Overgrow, Chlorophyll", "Grass, Poison", "45", "49", "49", "65", "65", "45", "7", "69", "1", "0.125", "False", "False", "True", "False", "64", "45", "Monster, Plant", "70", "", "green", "15.0", "1.0", "2.0", "0.5", "0.5", "0.25", "2.0", "0.5", "1.0", "1.0", "2.0", "2.0", "1.0", "1.0", "1.0", "1.0", "1.0", "1.0", "0.5"])
It's possible that getting this StringRecord could fail, so the result variable is of type Result<StringRecord, csv::Error>, which we can again handle with ?. This effectively unwraps the Result for us, or returns from main with the error if one exists.
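The same pattern of looping over an iterator whose items are Results can be tried without the csv crate. In this sketch, total is a made-up function and the string fields stand in for csv rows; each item is a Result, and ? inside the loop ends the whole function at the first failure.

```rust
use std::num::ParseIntError;

// Hypothetical helper: sums numeric fields, stopping at the first bad one.
fn total(fields: &[&str]) -> Result<u32, ParseIntError> {
    let mut sum = 0;
    // Each item from this iterator is a Result, much like each row from records().
    for result in fields.iter().map(|field| field.parse::<u32>()) {
        let value = result?; // unwrap the Ok value or return the Err
        sum += value;
    }
    Ok(sum)
}
```

With all-numeric fields the function returns the sum; an empty or non-numeric field makes it return that field's parse error instead.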
So record is a StringRecord, and we print it out using the dbg! macro.
The dbg! macro is a bit like a fancy console.log in JavaScript. It prints to stderr the file location where the dbg! macro was used, the expression we passed in, and the result of that expression.
[crates/upload-pokemon-data/src/main.rs:7] record = StringRecord(["Calyrex Shadow Rider", "898", "As One", "Psychic, Ghost", "100", "85", "80", "165", "100", "150", "24", "536", "8", "", "True", "True", "False", "True", "340", "3", "", "100", "", "green", "4.0", "0.0", "1.0", "1.0", "1.0", "1.0", "1.0", "0.0", "0.5", "1.0", "1.0", "0.5", "1.0", "1.0", "4.0", "1.0", "4.0", "1.0", "1.0"])
The dbg! macro uses the Debug trait implementation of the type we're trying to log out. In this case, that's a StringRecord.
The Debug trait produces developer-facing output that typically includes the type's name, which is why we see the StringRecord struct name in the console output when we run the program.
cargo run --bin upload-pokemon-data
Finally, if no errors have occurred, we need to return Ok(()).
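To connect dbg! and Debug outside of the csv crate, here is a minimal sketch with a made-up Pokemon struct (the field names are assumptions for illustration, not the workshop's schema). Deriving Debug is what makes the struct printable with {:?} and usable with dbg!.

```rust
// Hypothetical struct; derive(Debug) generates the Debug implementation.
#[derive(Debug)]
struct Pokemon {
    name: String,
    pokedex_id: u16,
}

// Formats a value with its Debug implementation, the same one dbg! uses.
fn debug_text(pokemon: &Pokemon) -> String {
    format!("{:?}", pokemon)
}

// dbg! prints the location, expression, and value to stderr,
// then hands the value back, so it can wrap an expression in place.
fn double_hp(hp: u32) -> u32 {
    dbg!(hp * 2)
}
```

Because dbg! returns its argument, it can be dropped around an expression temporarily while debugging and removed again without restructuring the code.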