Networked Environment for Music Analysis
INTRODUCTION
Phase I of the Networked Environment for Music Analysis (NEMA)
framework project is a multinational, multidisciplinary
cyberinfrastructure project for music information processing that
builds upon and extends the music information retrieval research being
conducted by the International Music Information Retrieval Systems
Evaluation Laboratory (IMIRSEL) at the University of Illinois at
Urbana-Champaign (UIUC). NEMA brings together the collective projects
and the associated tools of six world leaders in the domains of music
information retrieval (MIR), computational musicology (CM) and
e-humanities research. The NEMA team aims to create an open and
extensible webservice-based resource framework that facilitates the
integration of music data and analytic/evaluative tools that can be
used by the global MIR and CM research and education communities on a
basis independent of time or location. To help achieve this goal, the
NEMA team will be working co-operatively with the UIUC-based,
Mellon-funded, Software Environment for the Advancement of Scholarly
Research (SEASR) project to exploit SEASR’s expertise and technologies
in the domains of data mining and webservice-based resource framework
development.
NEMA is being funded through a generous grant from the Scholarly
Communications program of the Andrew W. Mellon Foundation.
An abridged PDF version of the original proposal is
available: nema_abridged_proposal.pdf.
MOTIVATION
The Networked Environment for Music Analysis (NEMA) project was
inspired by the lessons learned over the course of the Mellon-funded
Music Information Retrieval/Music Digital Library Evaluation Project
(2003-2007) led by Prof. J. Stephen Downie and his IMIRSEL team at
UIUC's Graduate School of Library and Information Science (GSLIS).
Downie’s experience in running the annual Music Information Retrieval
Evaluation eXchange (MIREX) on behalf of the MIR community has brought
to the fore three important issues that have a direct impact on the
present NEMA project: the automation, distribution and integration of
MIR and CM research tool development, evaluation and use. These are but
some of the important issues being addressed under the NEMA rubric.
VISION
NEMA Phase I offers the promise of a new and expanded MIR/CM research
paradigm. Under this new paradigm, it should become possible for MIR/CM
researchers to overcome limitations of time-specific and
location-specific resources. In the new NEMA reality, for example, it
should become commonplace for researchers at Lab A to easily build a
virtual collection from Library B and Lab C, acquire the necessary
ground-truth from Lab D, incorporate a feature extractor from Lab E,
amalgamate the extracted features with those provided by Lab F, build a
set of models based on a pair of classifiers from Labs G and H and then
validate the results against another virtual collection taken from Lab
I and Library J. Once completed, the results and newly created feature
sets would be, in turn, made available for others to build upon.
Figure 1. The NEMA framework
model bringing the NEMA components together. The components listed in
the grey portion of the diagram are independent technologies being
developed by members of the NEMA team.
KEY GOALS
- Resource accessibility.
For example, new means to provide access to good ground-truth sets, to
broad-based music collections, to feature sets, and to pre-built
models, etc. must be found. Also, where items in a music collection
cannot be moved, new ways of bringing researchers and their tools to the
data need to be constructed. It is important to envision a future where
many different collections of music materials are independently made
available in such a way as to create a much larger and more diverse
“super-collection.” Such “super-collections” are needed to address the
current problem of data "overuse" (i.e., the "overfitting" of models to
small datasets).
They are also needed to allow for better scalability/stress testing of
approaches. Finally, new methods of creating and providing on-demand
computational and storage resources to the MIR/CM community need to be
explored.
- Resource discovery.
For example, even if the aforementioned resources were readily
available it is still necessary to create appropriate music-specific
location and discovery tools so that individual items or resource
subsets might be put to use.
- Resource sharing/re-use.
For example, new standards for ground-truth and feature sets must be
developed to facilitate their re-use. Mechanisms need to be put into
place to make it easy for researchers to store or make their own sets
available to others. In the same manner, mechanisms must be put in
place to overcome the interoperability problems that limit the re-use
of research code, including feature extractors, classifiers, and
pre-built classification models, etc.
- Resource customization.
For example,
new ways need to be developed to help researchers amalgamate aspects of
independently produced feature sets to create novel feature sets. New
techniques must be found to easily create on-demand “virtual”
collections that span across several real-world collections regardless
of their physical location. Again, interoperability problems among
research code sets must be overcome so that researchers can create
customized hybrid systems that integrate tools from many different
research labs.
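
The customization goal above can be illustrated with a small sketch. The following Python is only an illustration under stated assumptions: it is not part of the NEMA framework, and every name in it (Track, build_virtual_collection, amalgamate_features, the sample track IDs and feature values) is hypothetical. It assumes that collections and feature sets share a common track identifier, which is itself one of the interoperability problems the goals describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical record for one item in a music collection."""
    track_id: str
    source: str  # name of the lab or library that holds the item


def build_virtual_collection(*collections):
    """Combine several collections into one "virtual" collection.

    Tracks are de-duplicated by track_id; the first occurrence wins.
    """
    seen = {}
    for collection in collections:
        for track in collection:
            seen.setdefault(track.track_id, track)
    return list(seen.values())


def amalgamate_features(*feature_sets):
    """Merge independently produced feature sets keyed by track_id.

    Only tracks present in every feature set are kept. If two sets use
    the same feature name, the later set overrides the earlier one.
    """
    if not feature_sets:
        return {}
    common = set(feature_sets[0])
    for feature_set in feature_sets[1:]:
        common &= set(feature_set)
    merged = {}
    for track_id in sorted(common):
        row = {}
        for feature_set in feature_sets:
            row.update(feature_set[track_id])
        merged[track_id] = row
    return merged


# Hypothetical sample data standing in for Library B, Lab C, Lab E and Lab F.
library_b = [Track("t1", "Library B"), Track("t2", "Library B")]
lab_c = [Track("t2", "Lab C"), Track("t3", "Lab C")]
virtual = build_virtual_collection(library_b, lab_c)

lab_e_features = {"t1": {"tempo": 120.0}, "t2": {"tempo": 96.0}}
lab_f_features = {"t2": {"key": "Dm"}, "t3": {"key": "C"}}
merged = amalgamate_features(lab_e_features, lab_f_features)
```

Keeping only the tracks common to all feature sets is one possible design choice; a real system might instead keep partial rows and mark the missing features.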
THE NEMA TEAM
Project Leadership
Principal Investigator: J. Stephen Downie (UIUC)
Co-Principal Investigator: Ichiro Fujinaga (McGill University)
Key Research Partners
David De Roure (University of Southampton, UK)
Mark Sandler (Queen Mary, University of London, UK)
Tim Crawford (Goldsmiths, University of London, UK)
David Bainbridge (University of Waikato, NZ)
Time Frame: 1 January 2008 to 31 December 2010