blob: f20eea88c2d99a1d1f27946e367a20852ebc4a83 [file] [log] [blame]
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]--><head>
<meta charset='utf-8'/><meta http-equiv='X-UA-Compatible' content='IE=edge'/><meta name='viewport' content='width=device-width, initial-scale=1'/><meta name='keywords' content='groovy, cycling, duckdb, sql, ginq, gquery, csv'/><meta name='description' content='This post looks at processing the general classification results for the Critérium du Dauphiné using Groovy, GQuery and DuckDB.'/><title>The Apache Groovy programming language - Blogs - Processing Results for the Critérium du Dauphiné</title><link href='../img/favicon.ico' type='image/x-ico' rel='icon'/><link rel='stylesheet' type='text/css' href='../css/bootstrap.css'/><link rel='stylesheet' type='text/css' href='../css/font-awesome.min.css'/><link rel='stylesheet' type='text/css' href='../css/style.css'/><link rel='stylesheet' type='text/css' href='https://cdnjs.cloudflare.com/ajax/libs/prettify/r298/prettify.min.css'/>
</head><body>
<div id='fork-me'>
<a href='https://github.com/apache/groovy'>
<img style='position: fixed; top: 20px; right: -58px; border: 0; z-index: 100; transform: rotate(45deg);' src='/img/horizontal-github-ribbon.png'/>
</a>
</div><div id='st-container' class='st-container st-effect-9'>
<nav class='st-menu st-effect-9' id='menu-12'>
<h2 class='icon icon-lab'>Socialize</h2><ul>
<li>
<a href='https://groovy-lang.org/mailing-lists.html' class='icon'><span class='fa fa-envelope'></span> Discuss on the mailing-list</a>
</li><li>
<a href='https://twitter.com/ApacheGroovy' class='icon'><span class='fa fa-twitter'></span> Groovy on Twitter</a>
</li><li>
<a href='https://groovy-lang.org/events.html' class='icon'><span class='fa fa-calendar'></span> Events and conferences</a>
</li><li>
<a href='https://github.com/apache/groovy' class='icon'><span class='fa fa-github'></span> Source code on GitHub</a>
</li><li>
<a href='https://groovy-lang.org/reporting-issues.html' class='icon'><span class='fa fa-bug'></span> Report issues in Jira</a>
</li><li>
<a href='http://stackoverflow.com/questions/tagged/groovy' class='icon'><span class='fa fa-stack-overflow'></span> Stack Overflow questions</a>
</li><li>
<a href='http://groovycommunity.com/' class='icon'><span class='fa fa-slack'></span> Slack Community</a>
</li>
</ul>
</nav><div class='st-pusher'>
<div class='st-content'>
<div class='st-content-inner'>
<!--[if lt IE 7]>
<p class="browsehappy">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</p>
<![endif]--><div><div class='navbar navbar-default navbar-static-top' role='navigation'>
<div class='container'>
<div class='navbar-header'>
<button type='button' class='navbar-toggle' data-toggle='collapse' data-target='.navbar-collapse'>
<span class='sr-only'></span><span class='icon-bar'></span><span class='icon-bar'></span><span class='icon-bar'></span>
</button><a class='navbar-brand' href='../index.html'>
<i class='fa fa-star'></i> Apache Groovy
</a>
</div><div class='navbar-collapse collapse'>
<ul class='nav navbar-nav navbar-right'>
<li class=''><a href='https://groovy-lang.org/learn.html'>Learn</a></li><li class=''><a href='https://groovy-lang.org/documentation.html'>Documentation</a></li><li class=''><a href='/download.html'>Download</a></li><li class=''><a href='https://groovy-lang.org/support.html'>Support</a></li><li class=''><a href='/'>Contribute</a></li><li class=''><a href='https://groovy-lang.org/ecosystem.html'>Ecosystem</a></li><li class=''><a href='/blog'>Blog posts</a></li><li class=''><a href='https://groovy.apache.org/events.html'></a></li><li>
<a data-effect='st-effect-9' class='st-trigger' href='#'>Socialize</a>
</li><li class=''>
<a href='../search.html'>
<i class='fa fa-search'></i>
</a>
</li>
</ul>
</div>
</div>
</div><div id='content' class='page-1'><div class='row'><div class='row-fluid'><div class='col-lg-3'><ul class='nav-sidebar'><li><a href='./'>Blog index</a></li><li class='active'><a href='#doc'>Processing Results for the Critérium du Dauphiné</a></li><li><a href='#_results_file' class='anchor-link'>Results file</a></li></ul><br/><ul class='nav-sidebar'><li style='padding: 0.35em 0.625em; background-color: #eee'><span>Related posts</span></li><li><a href='./reading-and-writing-csv-files'>Reading and Writing CSV files with Groovy</a></li><li><a href='./parsing-json-with-groovy'>Parsing JSON with Groovy</a></li><li><a href='./groovy-list-processing-cheat-sheet'>Groovy List Processing Cheat Sheet</a></li><li><a href='./zipping-collections-with-groovy'>Zipping Collections with Groovy</a></li><li><a href='./comparators-and-sorting-in-groovy'>Comparators and Sorting in Groovy</a></li><li><a href='./working-with-sql-databases-with'>Working with SQL databases with Groovy and GraalVM</a></li></ul></div><div class='col-lg-8 col-lg-pull-0'><a name='doc'></a><h1>Processing Results for the Critérium du Dauphiné</h1><p><span>Author: <i>Paul King</i></span><br/><span>Published: 2023-06-13 05:00PM</span></p><hr/><div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>The 2023 <a href="https://www.criterium-du-dauphine.fr/en/">Critérium du Dauphiné</a>
has just finished. Let&#8217;s examine the results using Groovy and DuckDB.
For the purposes of this post, we are interested in the overall top ten
riders in the general classification.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_results_file">Results file</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Our results are stored in a CSV file:</p>
</div>
<div class="paragraph">
<p><span class="image"><img src="img/topten.png" alt="lines from CSV file"></span></p>
</div>
<div class="paragraph">
<p>In a
<a href="https://groovy.apache.org/blog/reading-and-writing-csv-files">previous article</a>,
we looked at reading and writing CSV files using a number of CSV libraries. Today we will use a nice feature of
<a href="https://duckdb.org/">DuckDB</a>
which can read in CSV files on the fly.</p>
</div>
<div class="paragraph">
<p>Our goal is very simple, just print out the information but grouped by each rider&#8217;s country.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="prettyprint highlight"><code data-lang="groovy">Sql.withInstance('jdbc:duckdb:', 'org.duckdb.DuckDBDriver') { sql -&gt;
println "Country Riders Places Count"
sql.eachRow("""SELECT Country,
rpad(string_agg(Rider, ', '), 40, ' ') as Riders,
rpad(string_agg(Place, ', '), 10, ' ') as Places,
bar(count(Country), 0, 4, 30) as Count
FROM 'topten.csv' GROUP BY Country""") { row -&gt;
row.with {
println "$Country $Riders$Places$Count"
}
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Here we are making use of several built-in functions from DuckDB
including <code>rpad</code> to right pad the output, <code>string_agg</code> to aggregate
the riders (and their places) from the same country, and <code>bar</code>
to produce a pretty barchart.</p>
</div>
<div class="paragraph">
<p>Running this code produces the following output:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="prettyprint highlight"><code data-lang="text">Country Riders Places Count
🇩🇰 JONAS VINGEGAARD 1 ███████▌
🇬🇧 ADAM YATES 2 ███████▌
🇦🇺 BEN O’CONNOR, JAI HINDLEY, JACK HAIG 3, 4, 5 ██████████████████████▌
🇫🇷 GUILLAUME MARTIN, JULIAN ALAPHILIPPE 6, 10 ███████████████
🇿🇦 LOUIS DU BOUISSON MEINTJES 7 ███████▌
🇳🇴 TORSTEIN TRÆEN 8 ███████▌
🇪🇸 CARLOS RODRIGUEZ CANO 9 ███████▌</code></pre>
</div>
</div>
<div class="paragraph">
<p>We can achieve a similar result using GQuery (AKA GINQ) using the following code:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="prettyprint highlight"><code data-lang="groovy">var f = 'topten.csv' as File
var lines = f.readLines()*.split(',')
var cols = lines[0].size()
var rows = lines[1..-1].collect{row -&gt;
(0..&lt;cols).collectEntries{ col -&gt; [lines[0][col], row[col]] }}
var commaDelimited = Collectors.joining(', ')
var aggRiders = { it.stream().map(rec -&gt; rec.r.Rider).collect(commaDelimited) }
var aggPlaces = { it.stream().map(rec -&gt; rec.r.Place).collect(commaDelimited) }
println GQ {
from r in rows
groupby r.Country
select r.Country,
agg(aggRiders(_g)) as Riders,
agg(aggPlaces(_g)) as Places,
'██' * count(r.Country) as Count
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>We could use a CSV library to read in the data, but for this simple example
we&#8217;ll just use Groovy&#8217;s line/text processing capabilities.
GQuery doesn&#8217;t currently have built in equivalents to <code>bar</code> or <code>string_agg</code>
so we roll our own crude bar character function and aggregators, <code>aggRiders</code>
and <code>aggPlaces</code>.</p>
</div>
<div class="paragraph">
<p>Running this code gives the following output:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="prettyprint highlight"><code data-lang="text">+---------+--------------------------------------+---------+--------+
| Country | Riders | Places | Count |
+---------+--------------------------------------+---------+--------+
| 🇦🇺 | BEN O’CONNOR, JAI HINDLEY, JACK HAIG | 3, 4, 5 | ██████ |
| 🇩🇰 | JONAS VINGEGAARD | 1 | ██ |
| 🇳🇴 | TORSTEIN TRÆEN | 8 | ██ |
| 🇿🇦 | LOUIS DU BOUISSON MEINTJES | 7 | ██ |
| 🇬🇧 | ADAM YATES | 2 | ██ |
| 🇪🇸 | CARLOS RODRIGUEZ CANO | 9 | ██ |
| 🇫🇷 | GUILLAUME MARTIN, JULIAN ALAPHILIPPE | 6, 10 | ████ |
+---------+--------------------------------------+---------+--------+</code></pre>
</div>
</div>
<div class="paragraph">
<p>For Groovy 5, we are currently exploring the possibility of adding more
functions in GQuery, like <code>bar</code> and <code>stringAgg</code>. If you are interested in this
functionality, please get in touch via the
<a href="https://groovy-lang.org/mailing-lists.html">mailing lists</a> or
<a href="http://groovycommunity.com/">Slack</a>.</p>
</div>
</div>
</div></div></div></div></div><footer id='footer'>
<div class='row'>
<div class='colset-3-footer'>
<div class='col-1'>
<h1>Groovy</h1><ul>
<li><a href='https://groovy-lang.org/learn.html'>Learn</a></li><li><a href='https://groovy-lang.org/documentation.html'>Documentation</a></li><li><a href='/download.html'>Download</a></li><li><a href='https://groovy-lang.org/support.html'>Support</a></li><li><a href='/'>Contribute</a></li><li><a href='https://groovy-lang.org/ecosystem.html'>Ecosystem</a></li><li><a href='/blog'>Blog posts</a></li><li><a href='https://groovy.apache.org/events.html'></a></li>
</ul>
</div><div class='col-2'>
<h1>About</h1><ul>
<li><a href='https://github.com/apache/groovy'>Source code</a></li><li><a href='https://groovy-lang.org/security.html'>Security</a></li><li><a href='https://groovy-lang.org/learn.html#books'>Books</a></li><li><a href='https://groovy-lang.org/thanks.html'>Thanks</a></li><li><a href='http://www.apache.org/foundation/sponsorship.html'>Sponsorship</a></li><li><a href='https://groovy-lang.org/faq.html'>FAQ</a></li><li><a href='https://groovy-lang.org/search.html'>Search</a></li>
</ul>
</div><div class='col-3'>
<h1>Socialize</h1><ul>
<li><a href='https://groovy-lang.org/mailing-lists.html'>Discuss on the mailing-list</a></li><li><a href='https://twitter.com/ApacheGroovy'>Groovy on Twitter</a></li><li><a href='https://groovy-lang.org/events.html'>Events and conferences</a></li><li><a href='https://github.com/apache/groovy'>Source code on GitHub</a></li><li><a href='https://groovy-lang.org/reporting-issues.html'>Report issues in Jira</a></li><li><a href='http://stackoverflow.com/questions/tagged/groovy'>Stack Overflow questions</a></li><li><a href='http://groovycommunity.com/'>Slack Community</a></li>
</ul>
</div><div class='col-right'>
<p>
The Groovy programming language is supported by the <a href='http://www.apache.org'>Apache Software Foundation</a> and the Groovy community.
</p><div text-align='right'>
<img src='../img/asf_logo.png' title='The Apache Software Foundation' alt='The Apache Software Foundation' style='width:60%'/>
</div><p>Apache&reg; and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.</p>
</div>
</div><div class='clearfix'>&copy; 2003-2024 the Apache Groovy project &mdash; Groovy is Open Source: <a href='http://www.apache.org/licenses/LICENSE-2.0.html' alt='Apache 2 License'>license</a>, <a href='https://privacy.apache.org/policies/privacy-policy-public.html'>privacy policy</a>.</div>
</div>
</footer></div>
</div>
</div>
</div>
</div><script src='../js/vendor/jquery-1.10.2.min.js' defer></script><script src='../js/vendor/classie.js' defer></script><script src='../js/vendor/bootstrap.js' defer></script><script src='../js/vendor/sidebarEffects.js' defer></script><script src='../js/vendor/modernizr-2.6.2.min.js' defer></script><script src='../js/plugins.js' defer></script><script src='https://cdnjs.cloudflare.com/ajax/libs/prettify/r298/prettify.min.js'></script><script>document.addEventListener('DOMContentLoaded',prettyPrint)</script>
</body></html>