Data Processing and Analysis




Data organisation in spreadsheets

Introduction to data cleaning with OpenRefine

Exercises for the Introduction to data cleaning

  1. Import the .csv file and create a project.
  2. Change the case of the text in one column, and convert the values in the column with dates to the date type. Both options are under Edit cells > Common transforms.
  3. Try out the text facet, the timeline facet, and the text filter.
  4. Split a column into two or more columns.

Exercise 1

  1. Import the Wallonia dataset. 
  2. When you're done, click the green checkmark under Reactions in Zoom.

Exercise 2

  1. Find values in the Label column containing the text "farm".
  2. Which value in the Province column is most common among these rows?
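
To cross-check the answer outside OpenRefine, the filter-then-facet logic of this exercise can be sketched in Python. This is only an illustration: the sample rows are invented, the function name `most_common_province` is hypothetical, and the match is case-insensitive on the assumption that the text filter is left at its default setting.

```python
from collections import Counter

# Invented sample rows for illustration only; the real dataset will differ.
SAMPLE_ROWS = [
    {"Label": "Castle farm", "Province": "Namur"},
    {"Label": "Abbey Farm", "Province": "Namur"},
    {"Label": "Old mill", "Province": "Namur"},
    {"Label": "Farmhouse", "Province": "Liège"},
]


def most_common_province(rows, needle="farm"):
    """Return (province, count) for rows whose Label contains needle, or None."""
    needle = needle.lower()
    # Case-insensitive substring match, similar to a text filter on Label.
    matches = [row for row in rows if needle in row["Label"].lower()]
    # Counting the remaining Province values plays the role of a text facet.
    counts = Counter(row["Province"] for row in matches)
    return counts.most_common(1)[0] if counts else None


print(most_common_province(SAMPLE_ROWS))
```

The counts produced by `Counter` correspond to the numbers shown next to each value in a text facet on Province once the filter is active.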

Exercise 3

  1. Split the geopoint column.
  2. Rename the columns.
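
As a rough picture of what the split and rename achieve, here is a minimal Python sketch. It assumes that each geopoint cell holds two numbers separated by a comma, with the latitude first; check the actual cells before relying on that. The names `split_geopoint`, `latitude` and `longitude` are chosen for illustration only.

```python
def split_geopoint(value, separator=","):
    """Split 'lat,lon' text into a dict with two named numeric fields."""
    # maxsplit=1 yields at most two parts, like splitting into two columns.
    first, second = value.split(separator, 1)
    # Naming the two parts stands in for renaming the new columns.
    return {"latitude": float(first.strip()), "longitude": float(second.strip())}


# Invented coordinates for illustration only.
print(split_geopoint("50.4669, 4.8675"))
```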

Exercise 4

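  1. Split the Label column on the separator ( into at most 2 columns.
  2. Split the multi-valued cells in the Label column (Label 1 after the split) on the separator -.
  3. Trim leading and trailing whitespace and change the text to lowercase.
  4. Join the multi-valued cells.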
  4. Join multi-valued cells.
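
The four steps above can be pictured on a single cell value with a short Python sketch. The sample label, the function name `clean_label` and the choice of `-` as the join separator are assumptions made for illustration; in OpenRefine you pick the join separator yourself.

```python
def clean_label(label, join_separator="-"):
    """Mimic the exercise on one value: split on '(', split on '-', clean, join."""
    # Step 1: split on the first "(" into at most two parts.
    main, _, rest = label.partition("(")
    # Steps 2 and 3: split the first part on "-", then trim and lowercase each piece.
    pieces = [piece.strip().lower() for piece in main.split("-")]
    # Step 4: join the cleaned pieces back into a single value.
    return join_separator.join(pieces), rest


# Invented label for illustration only.
print(clean_label("Castle - Farm (ruins)"))
```

Note that the second part keeps its closing parenthesis, because only the opening one was used as the separator.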

Exercise 5

  1. In the History column, replace the text so that only the years remain.
  2. Split the multi-valued cells.
  3. Transform the values to dates.
  4. Try out the timeline facet.
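
The idea behind this exercise (keep only the years, one value per year, then convert to dates) can be sketched in Python as follows. The sketch assumes that every year in History is written as a four-digit number from 1000 onwards and maps each year to 1 January of that year; the sample sentence and the names `YEAR_PATTERN` and `years_as_dates` are invented for illustration.

```python
import re
from datetime import date

# Assumption: a year is a standalone four-digit number that does not start with 0.
YEAR_PATTERN = re.compile(r"\b[1-9]\d{3}\b")


def years_as_dates(history):
    """Extract all four-digit years from the text and return them as dates."""
    # findall returns one match per year, much like splitting multi-valued cells.
    years = YEAR_PATTERN.findall(history)
    # Each year becomes 1 January of that year so it can be placed on a timeline.
    return [date(int(year), 1, 1) for year in years]


# Invented history text for illustration only.
print(years_as_dates("Built in 1750, restored 1923."))
```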