Join the Digital Humanities Initiative for a workshop with Joe Edgerton!
Trying to untangle messy data and need a simple tool to help you manage it? OpenRefine can help! Participants will get a brief introduction to common data management issues and principles, followed by an introduction to the OpenRefine user interface. The instructor will then walk the class through cleaning a dataset and demonstrate various methods (e.g., column splitting, transformations). Participants will have the option to work through parts of the demonstration and are encouraged to ask questions along the way. This is meant to be an introductory workshop and a safe environment to try new tools and work through problems, so that participants can apply what they learn to their own research.
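For a rough sense of what "column splitting" and "transformations" mean before the workshop, here is a minimal Python sketch. It is not OpenRefine itself and not necessarily how the instructor will demonstrate these steps; the function name split_column and the sample rows are made up for illustration. It splits one column into two and trims stray whitespace, which is the kind of cleanup OpenRefine lets you do through its interface.

```python
def split_column(rows, column, separator, new_names):
    """Split one column into several, trimming whitespace from each part.

    Hypothetical helper for illustration only; it is not part of OpenRefine.
    """
    result = []
    for row in rows:
        # Transformation: split on the separator, then trim each piece.
        parts = [part.strip() for part in row[column].split(separator)]
        # Pad with empty strings so every row ends up with the same columns.
        parts += [""] * (len(new_names) - len(parts))
        # Keep the other columns, drop the original, add the new ones.
        new_row = {key: value for key, value in row.items() if key != column}
        new_row.update(zip(new_names, parts))
        result.append(new_row)
    return result


# Made-up sample data with inconsistent spacing.
rows = [
    {"name": "Woolf,  Virginia ", "year": "1925"},
    {"name": "Hurston, Zora Neale", "year": "1937"},
]

cleaned = split_column(rows, "name", ",", ["last_name", "first_name"])
for row in cleaned:
    print(row)
```

No programming is needed for the workshop; the sketch only shows the idea behind the operations.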
PRIOR TO THE WORKSHOP participants are encouraged to download OpenRefine. OpenRefine is a free, open-source Java application. Packages for Windows, macOS, and Linux are available at https://openrefine.org/download.html. Please download the latest stable version, choosing the option for your operating system.
Current versions for Windows (with embedded Java) and Mac include everything needed to run OpenRefine. For Windows, choose between an installer (.exe) file and a zip archive. The Linux version requires a “Java Runtime Environment” (JRE) installed on your system (see notes below). If you are using an older version of OpenRefine, we recommend upgrading to the latest stable version.
Please follow OpenRefine’s manual to install and run it. When OpenRefine starts, a command line window (a window with a black background) will open first, and lines of text will appear in it as OpenRefine runs. The OpenRefine interface will then open in your default web browser. Participants do not need to interact with the command line window. Leave it open in the background, and work on datasets in your web browser.
If you have any issues or questions, please feel free to email Joe Edgerton at zxc2sm@virginia.edu for help troubleshooting.
As a librarian, Joe Edgerton is an advocate of best practices for research materials and a supporter of the people who use them. He understands that folks come to data management from a variety of backgrounds, and he wants to approach it with care and understanding for both scholars and users of data.