So far we've been printing progress ourselves, but it would be nice not to shove a ton of output into the console. This matters more later, when we can push more requests faster and writing too much to the console will actually slow us down.
We'll add a crate called indicatif, which offers an extension trait called ProgressIterator that allows us to attach a progress bar directly to an iterator.
cargo add -p upload-pokemon-data indicatif
Bring the trait into scope at the top of src/main.rs.
use indicatif::ProgressIterator;
ProgressIterator offers us five new methods on iterators:
- progress_with
- try_progress
- progress
- progress_count
- progress_with_style
Taking a look at these function signatures, we can skip progress_with because it requires a ProgressBar and we haven't created one yet.
try_progress returns an Option, which indicates that something can fail. If we check the docs lower down, we see that try_progress uses the size_hint method on Iterators to see if it should display a progress bar or not. We always want a progress bar, so we'll skip this one: I happen to know that size_hint for our deserialize iterator won't be useful because we're streaming data in from a csv.
progress requires our iterator to also implement ExactSizeIterator, which we won't have because we're reading the csv in piece by piece.
Finally, there's progress_count, which takes a length.
No matter what we do, if we want the count up front we're going to have to read the entire csv before we start inserting, and the simplest way to do that is to hold it all in memory.
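To make the difference between a streaming iterator and one that knows its length concrete, here is a minimal sketch using only the standard library. StreamingRows, streaming_rows, and known_len are hypothetical names made up for this illustration; StreamingRows only stands in for something like a csv reader and is not how the csv crate is implemented.

```rust
/// Hypothetical stand-in for a streaming source such as a csv reader:
/// it hands out rows one at a time and doesn't advertise how many are left.
struct StreamingRows {
    inner: std::vec::IntoIter<&'static str>,
}

impl Iterator for StreamingRows {
    type Item = &'static str;

    // Only `next` is implemented, so `size_hint` keeps its default
    // value of `(0, None)`: "at least zero items, upper bound unknown".
    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next()
    }
}

/// Builds a small streaming source with three rows.
fn streaming_rows() -> StreamingRows {
    StreamingRows {
        inner: vec!["bulbasaur", "ivysaur", "venusaur"].into_iter(),
    }
}

/// Accepts only iterators that know their exact length, which is the
/// same kind of bound that `progress` asks for.
fn known_len<I: ExactSizeIterator>(iter: I) -> usize {
    iter.len()
}

fn main() {
    // The streaming source can't tell us anything useful about its length.
    println!("streaming: {:?}", streaming_rows().size_hint());

    // Once everything is collected into a Vec, the length is known.
    let rows: Vec<&str> = streaming_rows().collect();
    println!("vec: {:?}", rows.iter().size_hint());
    println!("len: {}", known_len(rows.into_iter()));

    // known_len(streaming_rows()); // would not compile: no ExactSizeIterator impl
}
```

The commented-out last line is the interesting one: the compiler rejects a streaming iterator wherever an ExactSizeIterator is required.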
We could use std::fs::read_to_string to read the whole csv into memory, but we're already reading it in with the csv::Reader. We can take advantage of that and .collect to end up with a Result<Vec<PokemonCsv>, csv::Error>.
let pokemon = rdr
    .deserialize()
    .collect::<Result<Vec<PokemonCsv>, csv::Error>>()
    .into_diagnostic()?;
Using .collect this way really shows off how flexible it can be when converting one container type into another. In this case, deserialize is giving us an iterator over individual Results. You can think of each item as a Result<PokemonCsv, csv::Error> for the purposes of this discussion. So what we have is like a Vec<Result<PokemonCsv, csv::Error>>. That is: a list of results.
.collect is able to transform that into a Result containing a list: Result<Vec<PokemonCsv>, csv::Error>. It does this by returning the first error immediately if there is one. So we do lose some error information, because any later errors are never seen, but we also gain the ability to stop earlier in the program if an error occurs.
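Here's a small sketch of that behavior using only the standard library, with string parsing standing in for csv deserialization. parse_all_counting is a hypothetical helper written for this illustration; it also counts how many items were attempted so we can see that collecting stops at the first error.

```rust
use std::num::ParseIntError;

/// Parses every input into a u32 and collects the results.
/// Also returns how many inputs were actually attempted.
fn parse_all_counting(inputs: &[&str]) -> (Result<Vec<u32>, ParseIntError>, usize) {
    let mut attempts = 0;
    // The iterator yields individual `Result<u32, ParseIntError>` values.
    // Collecting into `Result<Vec<_>, _>` flips that into a single Result.
    let result: Result<Vec<u32>, ParseIntError> = inputs
        .iter()
        .map(|s| {
            attempts += 1;
            s.parse::<u32>()
        })
        .collect();
    (result, attempts)
}

fn main() {
    // Every item parses, so we get Ok with the full list.
    let (all_good, attempts) = parse_all_counting(&["1", "25", "150"]);
    println!("{:?} after {} attempts", all_good, attempts);

    // The second item fails, so we get that error back and the
    // third item is never looked at.
    let (one_bad, attempts) = parse_all_counting(&["1", "pikachu", "also bad"]);
    println!("{:?} after {} attempts", one_bad, attempts);
}
```

In the second call, "also bad" would fail to parse as well, but we never find out about it. That's the error information we give up in exchange for stopping early.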
With our new Vec of pokemon, we can iterate over that vec to do our inserts. We use into_iter because it gives us ownership over the PokemonCsv values, which we need for our From implementation to work (if you want to see for yourself, change into_iter to iter here).
The iterator we get from into_iter is also an ExactSizeIterator because it's coming from a Vec now, not a Reader, so .progress works.
for record in pokemon.into_iter().progress() {
    let pokemon_row: PokemonTableRow = record.into();
    insert_pokemon(&pool, &pokemon_row).await?;
}
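If the into_iter versus iter distinction is unclear, this standalone sketch shows the same shape without the database or the progress bar. CsvRow, TableRow, and to_table_rows are hypothetical stand-ins for PokemonCsv, PokemonTableRow, and our loop; the real types have many more fields.

```rust
// Hypothetical stand-ins for PokemonCsv and PokemonTableRow.
struct CsvRow {
    name: String,
}

struct TableRow {
    name: String,
}

// The conversion takes a CsvRow by value, so it needs ownership.
impl From<CsvRow> for TableRow {
    fn from(row: CsvRow) -> Self {
        // Moves the String out of the csv row without cloning it.
        TableRow { name: row.name }
    }
}

/// Converts every csv row into a table row, consuming the input Vec.
fn to_table_rows(rows: Vec<CsvRow>) -> Vec<TableRow> {
    let mut out = Vec::new();
    // `into_iter` yields owned `CsvRow` values, so `.into()` can use
    // the `From<CsvRow>` impl above. With `rows.iter()` each item would
    // be a `&CsvRow`, and there is no `From<&CsvRow>` impl to call.
    for row in rows.into_iter() {
        let table_row: TableRow = row.into();
        out.push(table_row);
    }
    out
}

fn main() {
    let rows = vec![
        CsvRow { name: "bulbasaur".to_string() },
        CsvRow { name: "ivysaur".to_string() },
    ];
    for table_row in to_table_rows(rows) {
        println!("{}", table_row.name);
    }
}
```

Swapping into_iter for iter in to_table_rows turns the .into() call into a compile error, which is the same thing you'd see in our real loop.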
I've also removed our previous println! output, although there are ways to keep it in addition to the progress bar if we wanted to.
Now, I'm assuming you already ran the program to insert the data, so to run the program again we'll have to remove the data from the database.
In this case, it's easiest to drop the table from a mysql shell, which we can get from the pscale cli. I've left all the formatting in the following commands so that you know where I'm running each one. pscale is run on our computer, while drop table pokemon; is run in the mysql shell and will delete all of our data and the table itself.
Then we can re-run our create tables script to recreate the table for another run.
❯ pscale shell pokemon new-tables
pokemon/new-tables> drop table pokemon;
pokemon/new-tables> source crates/upload-pokemon-data/create-tables.sql
Running cargo run again:
DATABASE_URL=mysql://127.0.0.1 cargo run
will now show a progress bar!
█████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 105/1118