Python library for audio and music analysis

Overview

librosa

A Python package for music and audio analysis.

Documentation

See https://librosa.org/doc/ for a complete reference manual and introductory tutorials.

The advanced example gallery should give you a quick sense of the kinds of things that librosa can do.
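
For a minimal taste of the API, here is a quick-start sketch (assuming librosa is installed; the bundled 'trumpet' clip ships with recent releases): load an example recording, run the beat tracker, and compute a log-scaled mel spectrogram.

import numpy as np
import librosa

# Load a short example clip bundled with librosa
y, sr = librosa.load(librosa.ex('trumpet'))

# Estimate tempo and beat positions
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

# Log-scaled mel spectrogram
S = librosa.feature.melspectrogram(y=y, sr=sr)
S_db = librosa.power_to_db(S, ref=np.max)

print(tempo, S_db.shape)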

Installation

The latest stable release is available on PyPI, and you can install it by saying

pip install librosa

Anaconda users can install using conda-forge:

conda install -c conda-forge librosa

To build librosa from source, say python setup.py build. Then, to install librosa, say python setup.py install. If all went well, you should be able to execute the demo scripts under examples/ (OS X users should follow the installation guide given below).

Alternatively, you can download or clone the repository and use pip to handle dependencies:

unzip librosa.zip
pip install -e librosa

or

git clone https://github.com/librosa/librosa.git
pip install -e librosa

By calling pip list you should now see librosa as an installed package:

librosa (0.x.x, /path/to/librosa)

Hints for the Installation

librosa uses soundfile and audioread to load audio files. Note that soundfile does not currently support MP3, which will cause librosa to fall back on the audioread library.

soundfile

If you're using conda to install librosa, then most audio coding dependencies (except MP3) will be handled automatically.

If you're using pip on a Linux environment, you may need to install libsndfile manually. Please refer to the SoundFile installation documentation for details.

audioread and MP3 support

To fuel audioread with more audio-decoding power (e.g., for reading MP3 files), you may need to install either ffmpeg or GStreamer.

Note that on some platforms, audioread needs at least one of these programs to work properly.

If you are using Anaconda, install ffmpeg by calling

conda install -c conda-forge ffmpeg

If you are not using Anaconda, here are some common commands for different operating systems:

  • Linux (apt-get): apt-get install ffmpeg or apt-get install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Linux (yum): yum install ffmpeg or yum install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Mac: brew install ffmpeg or brew install gstreamer
  • Windows: download ffmpeg binaries from this website or gstreamer binaries from this website

For GStreamer, you also need to install the Python bindings with

pip install pygobject

Discussion

Please direct non-development questions and discussion topics to our web forum at https://groups.google.com/forum/#!forum/librosa

Citing

If you want to cite librosa in a scholarly work, there are two ways to do it.

  • If you are using the library for your work, please cite the specific version you used, as indexed at Zenodo, for the sake of reproducibility.

  • If you wish to cite librosa for its design, motivation etc., please cite the paper published at SciPy 2015:

    McFee, Brian, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. "librosa: Audio and music signal analysis in Python." In Proceedings of the 14th Python in Science Conference, pp. 18-25, 2015.

Issues
  • Multirate Filterbank from Chroma Toolbox

    Hi everyone,

    Following #394, this is a first attempt at integrating the multirate filterbank used in the Chroma Toolbox.

    I have another notebook where I show the whole processing chain: https://github.com/stefan-balke/mpa-exc/blob/master/02_fourier_transform/pitch_filterbank.ipynb

    As for implementation, it could live as another "spectral representation" in spectrum.py, although it's more of a CQT. But I guess we leave cqt.py reserved for the actual transform. Parameter-wise, this one is pretty much fixed, although one could add a parameter for "detuning", i.e., a reference frequency other than 440 Hz for A4.

    But that's up for discussion.

    As next steps, I would add some unit tests comparing the filter coefficients to the Chroma Toolbox, and a function which actually calls this filterbank...should be straightforward!
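
    For orientation, here is a rough sketch of the multirate idea (not the PR code; the filter order, Q, and rate thresholds below are illustrative): one bandpass filter per MIDI pitch, designed at a sample rate chosen by register, with the tuning reference exposed as a parameter (the "detuning" knob mentioned above).

    import scipy.signal

    def pitch_filterbank(sr=22050, midi_pitches=range(24, 109), Q=25.0, tuning_hz=440.0):
        """Return one (sos, fs) pair per MIDI pitch: filter coefficients plus the
        sample rate at which that band is meant to be processed."""
        filters = []
        for p in midi_pitches:
            fc = tuning_hz * 2.0 ** ((p - 69) / 12.0)  # center frequency of pitch p
            # multirate part: lower pitches are filtered at a lower sample rate
            if fc < 200:
                fs = sr // 8
            elif fc < 1000:
                fs = sr // 4
            else:
                fs = sr
            bw = fc / Q  # constant-Q bandwidth
            band = [(fc - bw / 2) / (fs / 2), (fc + bw / 2) / (fs / 2)]
            sos = scipy.signal.ellip(4, 1, 50, band, btype="bandpass", output="sos")
            filters.append((sos, fs))
        return filters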



    enhancement functionality 
    opened by stefan-balke 89
  • DTW

    Hey all,

    I mainly took @craffel's dijtw source code and merged it with mine. As discussed in #298, we want the following features:

    • [x] Arbitrary step sizes
    • [x] Additive or multiplicative local weights for the steps
    • [x] Subsequence (so it can be used for matching)
    • [x] Global path constraints (e.g. Sakoe-Chiba band etc.)
    • [x] make numba optional (cf. Brian's comment)
    • [x] test backtracking explicitly
    • [x] plot D + wp (in the examples)
    • ~~Gullying~~

    After finishing the features (which are in dijtw), we need more tests and a final notebook benchmarking the implementation against "vanilla-vanilla" DTW.
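
    For readers arriving later: this work ultimately shipped as librosa.sequence.dtw. A minimal usage sketch of the released API (the example clip and offsets here are arbitrary choices):

    import librosa

    y1, sr = librosa.load(librosa.ex('brahms'), duration=15)
    y2, sr = librosa.load(librosa.ex('brahms'), duration=15, offset=5)

    X = librosa.feature.chroma_cqt(y=y1, sr=sr)
    Y = librosa.feature.chroma_cqt(y=y2, sr=sr)

    # D: accumulated cost matrix; wp: warping path as (row, col) index pairs
    D, wp = librosa.sequence.dtw(X=X, Y=Y, metric='cosine')
    print(D[-1, -1], wp.shape)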



    functionality 
    opened by stefan-balke 84
  • YIN and pYIN

    Reference Issue

    Fixes #527

    What does this implement/fix? Explain your changes.

    This pull request implements the YIN and pYIN algorithms for pitch tracking. The YIN function is based on @lostanlen's PR #974 (with a few modifications) while the pYIN function is based on this paper.

    Any other comments?

    Both functions work well, but some refactoring is definitely needed. I compared the outputs of pYIN to the official vamp plugin and the results are comparable. I haven't added any special treatment of low-amplitude frames yet, so silent frames with periodic noise occasionally give wrong results.

    Also, note that I haven't used the librosa.core.autocorrelate function. librosa.core.autocorrelate computes equation (2) in the YIN paper, which seems to perform worse than the approach used here (which computes equation (1)). I tried scaling by the window size (as suggested in the paper) but it didn't improve things much.

    I haven't implemented any tests yet so it would be good to get some guidance there.
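
    For later reference, these functions landed in librosa 0.8.0 as librosa.yin and librosa.pyin; a minimal usage sketch of the released API (example clip and pitch range chosen arbitrarily):

    import numpy as np
    import librosa

    y, sr = librosa.load(librosa.ex('trumpet'))

    f0_yin = librosa.yin(y, fmin=librosa.note_to_hz('C2'),
                         fmax=librosa.note_to_hz('C7'), sr=sr)

    f0, voiced_flag, voiced_prob = librosa.pyin(y, fmin=librosa.note_to_hz('C2'),
                                                fmax=librosa.note_to_hz('C7'), sr=sr)
    print(np.nanmedian(f0))  # pyin marks unvoiced frames as nan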

    functionality 
    opened by bshall 59
  • DTW

    Hey there,

    I recently did a DTW implementation based on Meinard's book, with some Cython speed-up for the dynamic programming. As this is not yet reflected in librosa, I wondered if we could add a module for music synchronization.

    Best Stefan

    functionality discussion 
    opened by stefan-balke 45
  • CQT length scale normalization

    This PR implements #412, and a couple of bug-fixes. Summary of contents:

    • Added a scale boolean option to all CQT methods. If enabled, CQT bands are normalized by sqrt(n[i]) where n[i] is the length of the ith filter. This is analogous to norm='ortho' mode in np.fft.fft.
    • Early downsampling is less aggressive by one octave. Previously, it was too aggressive, and the top end of each octave was close to Nyquist, which resulted in attenuation.
    • Magnitude is now continuous and approximately equal for full, hybrid, and pseudo CQT. This fixes some of the unresolved continuity errors noted in #347.
    • Expanded the CQT unit tests to include a white noise continuity test.

    Using scale=True means that white noise input will look like white noise output in CQT space (flat spectrum). With scale=False (current default), white noise in looks like 1/sqrt(f).

    scale=True makes an impulse look like sqrt(f) as opposed to constant for scale=False.
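
    A quick way to sanity-check the scaling behavior described above (the scale flag is available in released versions of librosa.cqt; exact values will vary since the input is random noise):

    import numpy as np
    import librosa

    sr = 22050
    y = np.random.randn(5 * sr)

    C_scaled = np.abs(librosa.cqt(y, sr=sr, scale=True))
    C_raw = np.abs(librosa.cqt(y, sr=sr, scale=False))

    print(C_scaled.mean(axis=1)[:5])  # roughly constant across bins (flat spectrum)
    print(C_raw.mean(axis=1)[:5])     # decays roughly like 1/sqrt(f)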



    bug enhancement functionality API change Hacktoberfest 
    opened by bmcfee 41
  • Conda packaging

    Providing a conda distribution might save some headaches at install time, especially when it comes to codec dependencies.

    enhancement management 
    opened by bmcfee 33
  • Harmonics

    This PR implements a harmonic estimator.

    This can be used as a subroutine to estimate pitch or tempo salience.



    enhancement functionality Hacktoberfest 
    opened by bmcfee 31
  • Default sample rate for librosa.core.load is not the native sample rate

    Currently, the default sample rate is 22050 Hz when loading an audio file. In order to use the native sample rate, sr must be set to None. I often forget to set this to None when I want the native sample rate, and I have noticed many students do the same.
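
    For reference, getting the native rate just requires passing sr=None explicitly:

    import librosa

    path = librosa.ex('trumpet')  # any audio file path works here

    y_default, sr_default = librosa.load(path)         # resampled to 22050 Hz
    y_native, sr_native = librosa.load(path, sr=None)  # keeps the file's own rate
    print(sr_default, sr_native)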

    Have others experienced confusion over this? Should this change in a later release so that the native sample rate is the default?

    discussion IO 
    opened by mcartwright 31
  • Added optional pyFFTW backend

    As proposed in https://github.com/librosa/librosa/issues/353, scipy.fftpack is a terrible bottleneck, and supporting FFTW would be immensely useful. This pull request adds an optional wrapper for FFTW. If pyFFTW is installed, librosa will prefer it over scipy.fftpack. Everything will work as normal if the user is missing pyFFTW.

    @bmcfee, thoughts? The speedup is substantial, and as librosa depends on STFT all over the place (even for rmse) I feel this is a necessary addition, and a lot nicer than having users monkey patch outside of librosa.
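
    For context, a minimal sketch of the optional-backend pattern the PR describes (illustrative only, not librosa's actual internals): prefer pyFFTW's drop-in SciPy interface when it is importable, otherwise fall back to scipy.fftpack.

    try:
        from pyfftw.interfaces import scipy_fftpack as fft  # FFTW-backed, same call signatures
        import pyfftw
        pyfftw.interfaces.cache.enable()  # reuse FFTW plans across calls
    except ImportError:
        from scipy import fftpack as fft  # default backend

    # Downstream code calls fft.fft / fft.ifft without caring which backend is active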



    enhancement 
    opened by carlthome 30
  • RFC: more example tracks?

    It's been mentioned on numerous occasions that our current example track, catchy as it may be, is not great for demonstrating many of the functions of librosa. In offline discussions, @lostanlen and I have talked about extending the included audio content to have several tracks, which could be used to demonstrate different functionality. So I want to kick this out to the community: what do you want to see in our examples? By this I mean: please recommend specific recordings.

    To prevent this discussion from becoming an infinite bike shed, I'm going to lay down some ground rules for inclusion:

    1. Content must be CC-licensed or public domain.
    2. Total content should not bloat the package too much; I think 10MB is a reasonable upper bound, and this functionally limits us to between 5 and 10 total recordings.
    3. No single track should be too long. It's okay to have some very short examples.
    4. The total collection should be diverse in terms of style, instrumentation, polyphony, etc. If possible, I'd like to include some non-western recordings as well.
    5. Any lyrical content should not be offensive, for some reasonable definition of offensive which is compatible with our CoC. I don't expect any problems here, but I'll reserve executive privilege here to veto anything that could be problematic.
    6. Familiarity would be a bonus, for making the examples and documentation more immediately accessible.

    With all of that out of the way, let's talk about things not currently demonstrated by our current example. We don't have to hit all of these, and I'm sure I'm missing some, but we should aim to hit most of them.

    • Monophonic audio: we should have at least one solo instrument recording that can be used to demonstrate things like pitch tracking. Maybe a raga or makam could be good here? This could also be good for demonstrating onsets.
    • Interesting harmony: the current example is pretty boring, harmonically speaking. Maybe a jazz recording would be appropriate here? Maybe something with some key changes as well.
    • Non-percussive rhythmic elements: something classical (strings) would be nice to have for demonstrating onset and beat tracking with soft attacks.
    • Different time signatures: examples with 4/4, 3/4, and maybe a 5/4 or 7/8 would be nice for demonstrating some of the rhythmic features (tempogram, fmt)
    • Vocals and instrumentals: we should have at least one track with vocals.
    • Non-musical audio: do we need/want this? Speech? Environmental sound? Librosa gets used for these things, so it might be worth considering their inclusion.

    Some other discussions that we can have around this:

    • Should we have a multi-track / stem set as one of the examples?
    • What kind of genre coverage should we strive for?
    • How does this issue interact with #641 (non-western systems in display)? Is #641 a pre-requisite for including non-western examples? (I'd argue that it should be, and that this would be a good motivating factor for finally doing it.)

    Finally, the examples gallery already includes a few candidate options here. We can take some, all, or none of these, but whatever we decide on including should serve as plausible replacements for them and the example notebooks should be revised afterward.

    discussion management 
    opened by bmcfee 30
  • Framing fails with ndim>2

    Describe the bug This came up in testing #1351. #944 implemented multidimensional framing, and the tests cover inputs of up to 2 dimensions. However, when ndim is 3 or larger, framing no longer behaves correctly, and pulls garbage in from memory.

    To Reproduce

    >>> import numpy as np
    >>> import librosa
    
    >>> # Create a 4d input and slice it down
    >>> x4 = np.random.randn(2, 3, 4, 100)
    >>> x3 = x4[0]
    >>> x2 = x3[0]
    >>> x1 = x2[0]
    
    >>> # Frame each version with the same parameters
    >>> x4f = librosa.util.frame(x4, 10, 1)
    >>> x3f = librosa.util.frame(x3, 10, 1)
    >>> x2f = librosa.util.frame(x2, 10, 1)
    >>> x1f = librosa.util.frame(x1, 10, 1)
    
    >>> # We should have x1f = x2f[0, :] = x3f[0, 0, :] = x4f[0,0,0, :]
    >>> np.allclose(x1f, x2f[0])
    True
    
    >>> np.allclose(x1f, x3f[0, 0])
    False
    
    >>> np.allclose(x2f, x3f[0])
    False
    

    Expected behavior Leading slicing should commute with trailing framing (and vice versa).

    Software versions*

    Linux-5.8.8s1+-x86_64-with-glibc2.33
    Python 3.9.7 | packaged by conda-forge | (default, Sep  2 2021, 17:58:34) 
    [GCC 9.4.0]
    NumPy 1.20.3
    SciPy 1.7.1
    librosa 0.8.1
    INSTALLED VERSIONS
    ------------------
    python: 3.9.7 | packaged by conda-forge | (default, Sep  2 2021, 17:58:34) 
    [GCC 9.4.0]
    
    librosa: 0.8.1
    
    audioread: 2.1.9
    numpy: 1.20.3
    scipy: 1.7.1
    sklearn: 0.24.2
    joblib: 1.0.1
    decorator: 5.0.9
    soundfile: 0.9.0
    resampy: 0.2.2
    numba: 0.54.0
    
    numpydoc: 1.1.0
    sphinx: 4.1.2
    sphinx_rtd_theme: 0.5.1
    sphinxcontrib.versioning: None
    sphinx-gallery: None
    pytest: 6.2.5
    pytest-mpl: None
    pytest-cov: None
    matplotlib: 3.4.3
    presets: None
    

    Additional context It's entirely possible that something is awry in how we're forcing contiguity within frame. I thought I understood this in general, but maybe I was mistaken when implementing #966.

    bug 
    opened by bmcfee 4
  • Multichannel display

    We've previously decided that #1130 does not directly include modifications to the display module, and that those functions should remain mono-only. However, this doesn't rule out the possibility of adding new functionality to support multichannel visualizations easily, and I think this would be quite helpful in general.

    Describe the solution you'd like

    This is easiest to consider starting from wave plots. In the common case of 2-channel signals, I think it would be helpful to have a plot that renders a joined pair of subplots with one channel each and shared axes, like you might see in a DAW.

    Here's some prototype code that does this already:

    import librosa
    import matplotlib.pyplot as plt

    # y: a 2-channel signal of shape (2, n); sr: its sampling rate
    fig = plt.figure(figsize=(10, 5), constrained_layout=False)

    grid = fig.add_gridspec(2, 1, wspace=0, hspace=0)
    axes = grid.subplots(sharex=True, sharey=True)

    librosa.display.waveshow(y[0], sr=sr, ax=axes[0])
    librosa.display.waveshow(y[1], sr=sr, ax=axes[1])
    axes[0].label_outer()
    axes[0].set(xlabel=None)
    fig.show()
    

    With the trumpet example, this produces a stacked pair of single-channel waveform plots with shared axes (image omitted).

    Note that this code would break the current pattern of display functions accepting target axes, since it will need to create an axes for each channel. I think this is fine, and we can still accept fig as a target or create a new one as needed.

    We also have to do a little bit of cleanup here to hide the xlabel and ticks. (Side note: maybe waveshow and specshow should have options to not set axis labels.)

    This approach would easily generalize to C>2 channels, though if we have higher order arrangements we might get into trouble and have to flatten the array down in advance.

    Some parameters we should expose (with defaults):

    • wspace,hspace=0: spacing between subplots. I think most people would expect vertical packing as I've done in this example, but it should be configurable.
    • sharex,sharey=True: subplot axis coupling. This seems generally desirable, but maybe there are use cases where it would not be helpful.
    • waveshow parameters (passthrough)

    The function would then return, I guess, the grid object? This seems to be how seaborn does it for a similar use-case, and they seem pretty on-the-ball.

    We could do similar things for specshow, though we might not want to default to dense packing there because specshow plots don't have the benefit of negative visual space like waves do.

    Describe alternatives you've considered We could also just punt this and leave people to their own devices, but that seems like a disservice.

    functionality API change 
    opened by bmcfee 1
  • effects.trim and split should aggregate energy across channels

    Describe the bug The functions trim and split do not handle multichannel audio well: all channels are combined into one signal using core.to_mono. This may be a problem for several types of multichannel signals (for example, when the two channels are out of phase). These functions should be updated to allow each channel's signal to be processed individually.
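
    As a quick illustration of the failure mode (a synthetic example, not taken from the report): two out-of-phase channels cancel under librosa.to_mono, so the mono mix that trim/split operate on is essentially silent even though each channel has plenty of energy.

    import numpy as np
    import librosa

    sr = 22050
    t = np.arange(sr) / sr
    x = 0.5 * np.sin(2 * np.pi * 440 * t)
    y = np.stack([x, -x])  # two channels, 180 degrees out of phase

    print(np.abs(x).max())                   # ~0.5: each channel has energy
    print(np.abs(librosa.to_mono(y)).max())  # ~0: the mono mix is silent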

    bug API change 
    opened by dafaronbi 2
  • specshow's xticklabels do not print hour/minutes for short signals with offset

    Describe the bug I am using librosa.specshow with x_axis="time" to display spectrograms. I think I've found a bug in the xticklabels. If the signal is less than one minute long (e.g., Brahms), all goes well. If it is more than one minute long and I'm showing all of it, it also goes well. But if the signal is 90 seconds long and I want to show the last 40 seconds of it, librosa does not print the xticklabels in mm:ss format but in scientific notation (see below).

    To Reproduce

    Example:

    import librosa
    import librosa.display
    import numpy as np
    from matplotlib import pyplot as plt

    def specshow_with_offset(y, sr, offset):
        S = librosa.power_to_db(np.abs(librosa.stft(y)))
        times = librosa.times_like(S, sr=sr)
        librosa.display.specshow(
            S, sr=sr, x_coords=times + offset, x_axis="time",
            y_axis="hz")

    plt.figure(figsize=(6, 6))
    y, sr = librosa.load(librosa.ex("brahms"))
    for fig_id, offset in enumerate([0, 60, 120]):
        plt.subplot(3, 1, 1 + fig_id)
        specshow_with_offset(y, sr, offset)
    

    Expected behavior Use mm:ss or hh:mm:ss notations depending on the values taken by x_coords rather than depending on the duration of the input.

    Screenshots The attached screenshot (omitted here) shows the tick labels of the offset subplots rendered in scientific notation rather than mm:ss.

    Software versions*

    macOS-10.15.7-x86_64-i386-64bit
    Python 3.8.10 (default, May 19 2021, 11:01:55) 
    [Clang 10.0.0 ]
    NumPy 1.21.0
    SciPy 1.4.1
    librosa 0.8.1
    INSTALLED VERSIONS
    ------------------
    python: 3.8.10 (default, May 19 2021, 11:01:55) 
    [Clang 10.0.0 ]
    
    librosa: 0.8.1
    
    audioread: 2.1.9
    numpy: 1.21.0
    scipy: 1.4.1
    sklearn: 0.22.1
    joblib: 1.0.1
    decorator: 5.0.9
    soundfile: 0.9.0
    resampy: 0.2.2
    numba: 0.48.0
    
    

    Additional context Related: #737 #760

    bug API change 
    opened by lostanlen 4
  • Deemphasis modifies signals in place

    Describe the bug effects.deemphasis performs an in-place subtraction on the first sample of the input signal to handle the initial boundary condition.

    It should not do this; in general, we should not modify input signals.
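
    A quick way to observe the in-place modification (using a bundled example clip; on affected versions the comparison below fails because y[0] gets altered):

    import numpy as np
    import librosa

    y, sr = librosa.load(librosa.ex('trumpet'))
    y_before = y.copy()

    _ = librosa.effects.deemphasis(y)
    print(np.array_equal(y, y_before))  # False on affected versions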

    Relatedly, I think the handling of zi is incorrect in general, and not actually usable for its intended purpose of incremental block processing. This is because there is no way for the user to access the final state zf generated by lfilter, and so there would never be a value to pass in other than None.

    The obvious solution, I would think, is to pass in zi directly rather than subtract it from y[0] on input. However, doing this with the initial zi we currently use produces incorrect output and does not correctly invert the preemphasis filter.

    I'd be curious for opinions on the best way around this: @lostanlen any thoughts?

    Software versions* 0.8.1

    bug API change 
    opened by bmcfee 9
  • Infer default fmax in melspectrogram from sampling rate

    Is your feature request related to a problem? Please describe. When computing a tempogram with a sample rate lower than 22050 I get the following UserWarning:

    >>> import librosa
    >>> import numpy
    
    >>> sr = 8000  # using sample rate lower than 22050
    >>> duration = 10.
    >>> y = numpy.random.rand(int(sr * duration))
    >>> librosa.feature.tempogram(y, sr=sr)
    UserWarning: Empty filters detected in mel frequency basis. Some channels will produce empty responses. Try increasing your sampling rate (and fmax) or reducing n_mels.
    

    From the UserWarning, it was not directly obvious to me where the mel filters were even computed or how to change this behavior.

    Describe the solution you'd like Inside onset_strength_multi the

    "Function for computing time-series features, eg, scaled spectrograms. By default, uses librosa.feature.melspectrogram with fmax=11025.0"

    (as quoted from the docs) uses, as stated, a hard-coded fmax of 11025.0. I would suggest using the available sample-rate argument inside onset_strength_multi and setting fmax to sr / 2 instead of the hard-coded 11025.0.

    if feature is None:
       feature = melspectrogram
       kwargs.setdefault("fmax", sr / 2) # instead of kwargs.setdefault("fmax", 11025.0) 
    

    This would get rid of the UserWarnings when working with sample rates lower than 22050, and I don't see a scenario in which a hard-coded value is more desirable.

    Describe alternatives you've considered There is the option to circumvent the UserWarning by precomputing the onset_envelope and passing the fmax via the kwargs:

    >>> sr, duration = 8000, 10.
    >>> y = numpy.random.rand(int(sr * duration))

    >>> oenv = librosa.onset.onset_strength(y=y, sr=sr, fmax=sr/2)  # precompute onset envelope and pass `fmax` via kwargs
    >>> tempogram = librosa.feature.tempogram(onset_envelope=oenv, sr=sr)
    

    I appreciate this option, but it took me a while to figure out ;)

    Thanks for considering my suggestion; I am also open to hearing how hard-coding 11025 makes more sense!

    Additional context

    Linux-5.11.0-7620-generic-x86_64-with-glibc2.2.5
    Python 3.8.8 (default, Nov 10 2011, 15:00:00) 
    [GCC 10.2.0]
    NumPy 1.19.5
    SciPy 1.5.4
    librosa 0.8.1
    INSTALLED VERSIONS
    ------------------
    python: 3.8.8 (default, Nov 10 2011, 15:00:00) 
    [GCC 10.2.0]
    
    librosa: 0.8.1
    
    audioread: 2.1.9
    numpy: 1.19.5
    scipy: 1.5.4
    sklearn: 0.24.1
    joblib: 1.0.1
    decorator: 4.4.2
    soundfile: 0.10.3
    resampy: 0.2.2
    numba: 0.53.1
    
    numpydoc: None
    sphinx: None
    sphinx_rtd_theme: None
    sphinxcontrib.versioning: None
    sphinx-gallery: None
    pytest: None
    pytest-mpl: None
    pytest-cov: None
    matplotlib: 3.4.2
    presets: None
    
    API change 
    opened by JKybelka 1
  • _spectrogram helper returns incorrect n_fft for odd lengths

    Describe the bug

    Much of the librosa.feature submodule relies upon a helper function, librosa.core.spectrum._spectrogram, which allows us to easily support both time-domain and time-frequency-domain inputs for spectral features. When y is given but S is not, a spectrogram is constructed and returned, along with n_fft. When S is given, no new spectrogram is computed, but n_fft is inferred from its shape: https://github.com/librosa/librosa/blob/d9b72af964820fc796425c55d3dccd7f6b908bf9/librosa/core/spectrum.py#L2505-L2507

    The problem here has been noted elsewhere, notably #978, where we're implicitly assuming an even frame length. Most of the time this doesn't matter, except when it does. :upside_down_face:

    In general, we can't accurately infer n_fft since the floor operation is inherently non-invertible. However, we could at least check if the provided n_fft is consistent with S.shape, and return it as is if so. If it is not, a warning should be raised. This way, a user can at least override the incorrect inference with an explicit value and get coherent results.
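
    A toy illustration of the non-invertibility (plain arithmetic, no librosa needed): an odd frame length does not survive the round trip.

    n_fft = 1023
    n_bins = n_fft // 2 + 1        # S would have 512 frequency bins
    n_fft_inferred = 2 * (n_bins - 1)
    print(n_fft, n_fft_inferred)   # 1023 vs. 1022: the odd length is lost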

    Proposed fix Replace https://github.com/librosa/librosa/blob/d9b72af964820fc796425c55d3dccd7f6b908bf9/librosa/core/spectrum.py#L2507 by:

    if n_fft // 2 + 1 != S.shape[-2]:  # frequency axis is -2, not 0, to support multichannel inputs
        warnings.warn('Uhoh...')
    

    Software versions*

    All versions up through 0.8.1.

    bug 
    opened by bmcfee 1
  • Harmonic interpolation can produce nans

    Describe the bug

    This issue popped up during #1351.

    Our harmonic interpolation function uses scipy.interpolate.interp1d under the hood, and then evaluates the interpolator at different points to produce the harmonic spectrogram planes. By default, we use kind='linear', and this works fine when the center frequencies are all unique (e.g., fft_frequencies), so that the interpolation is well defined.

    However, when using a frequency-reassigned spectrum, we may not have unique frequency values at adjacent bins, which produces an ill-defined interpolation and the following warnings:

    /home/bmcfee/miniconda/envs/py39/lib/python3.9/site-packages/scipy/interpolate/interpolate.py:623: RuntimeWarning: divide by zero encountered in true_divide
      slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
    /home/bmcfee/miniconda/envs/py39/lib/python3.9/site-packages/scipy/interpolate/interpolate.py:626: RuntimeWarning: invalid value encountered in multiply
      y_new = slope*(x_new - x_lo)[:, None] + y_lo
    

    To Reproduce

    import numpy as np
    import librosa

    y, sr = librosa.load(librosa.ex('trumpet'))
    freqs, times, mags = librosa.reassigned_spectrogram(y, sr=sr, fill_nan=True)

    mags_h = librosa.interp_harmonics(mags, freqs, h_range=[1, 2, 3])

    # The following assertion should pass
    assert np.all(np.isfinite(mags_h))
    

    This is caused by a repeated frequency value, which we can detect by differencing along the frequency axis:

    >>> np.diff(freqs, axis=0)
    Out[17]: 
    array([[ 3.62758875,  4.69766474,  0.        , ..., 10.76660156,
            10.76660156, 10.76660156],
    ...
    

    Expected behavior All interpolated magnitudes should be finite.

    Proposed solution This behavior vanishes if we switch to a spline interpolator, e.g., kind='slinear'.

    I propose that we switch the default interpolation mode to slinear in 0.9, and add a note to the docstring of interp_harmonics discouraging linear interpolation when frequencies are not guaranteed to be unique.

    Alternatively, we could check for uniqueness of frequencies at runtime, but this seems A) a bit out of scope (it should be the interpolator's job), and B) subject to numerical precision issues. The spline interpolation modes work well in this regime, so we may as well encourage their use.

    Software versions* Affects all versions up through 0.8.1.

    bug API change 
    opened by bmcfee 1
  • librosa.display.specshow does not display the correct time scale

    When trying to display a chromagram using librosa.display.specshow, the time scale is not correct. The time scale is correct when the hop_length is its default value (512). However, it is incorrect whenever the hop_length is changed (the displayed length increases as I increase the hop_length).

    Code I used to generate and plot the chromagram:

    import librosa
    import librosa.display
    import matplotlib.pyplot as plt

    # 512 is the default; when this number is altered, the time scale is wrong
    HOP_SIZE = 512

    y, sr = librosa.load(mp3)  # mp3: path to the audio file being analyzed
    chroma_feature = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=HOP_SIZE)

    fig, ax = plt.subplots()
    img = librosa.display.specshow(chroma_feature, y_axis='chroma', x_axis='time',
                                   ax=ax, sr=sr, hop_length=HOP_SIZE)
    fig.colorbar(img, ax=ax)
    plt.show()
    
    enhancement 
    opened by clare228 12
  • (1, n_samples) is not accepted as a mono waveform

    Description Librosa typically operates on one-dimensional, single-channel signals whose shape is (n_samples,). In some cases, stereo signals are accepted with channels on the leading axis, e.g. of shape (2, n_samples). But the analogous single-channel signal, whose shape is (1, n_samples), is rejected -- at least by librosa.resample().

    To Reproduce

    import librosa
    import numpy as np
    
    # Conventional mono
    mono_signal = np.random.randn(1000)
    resampled_mono = librosa.resample(mono_signal, 1000, 2000)
    
    # Conventional stereo
    stereo_signal = np.random.randn(2, 1000)
    resampled_stereo = librosa.resample(stereo_signal, 1000, 2000)
    
    # Stereo-like single channel
    onechan_signal = np.random.randn(1, 1000)
    resampled_onechan = librosa.resample(onechan_signal, 1000, 2000)
    
    

    Expected behavior resampled_mono == resampled_onechan[0]

    Actual behavior

    ---------------------------------------------------------------------------
    ParameterError                            Traceback (most recent call last)
    <ipython-input-7-3776517cab8b> in <module>()
          1 onechan_signal = np.random.randn(1, 1000)
    ----> 2 resampled_onechan = librosa.resample(onechan_signal, 1000, 2000)
          3 print(resampled_onechan.shape)
    
    1 frames
    /usr/local/lib/python3.7/dist-packages/librosa/util/utils.py in valid_audio(y, mono)
        304     elif y.ndim == 2 and y.shape[0] < 2:
        305         raise ParameterError(
    --> 306             "Mono data must have shape (samples,). " "Received shape={}".format(y.shape)
        307         )
        308 
    
    ParameterError: Mono data must have shape (samples,). Received shape=(1, 1000)
    

    Software versions

    Linux-5.4.104+-x86_64-with-Ubuntu-18.04-bionic
    Python 3.7.10 (default, May  3 2021, 02:48:31) 
    [GCC 7.5.0]
    NumPy 1.19.5
    SciPy 1.4.1
    librosa 0.8.1
    INSTALLED VERSIONS
    ------------------
    python: 3.7.10 (default, May  3 2021, 02:48:31) 
    [GCC 7.5.0]
    
    librosa: 0.8.1
    
    audioread: 2.1.9
    numpy: 1.19.5
    scipy: 1.4.1
    sklearn: 0.22.2.post1
    joblib: 1.0.1
    decorator: 4.4.2
    soundfile: 0.10.3
    resampy: 0.2.2
    numba: 0.51.2
    
    numpydoc: None
    sphinx: 1.8.5
    sphinx_rtd_theme: None
    sphinxcontrib.versioning: None
    sphinx-gallery: None
    pytest: 3.6.4
    pytest-mpl: None
    pytest-cov: None
    matplotlib: 3.2.2
    presets: None
    
    discussion 
    opened by dpwe 12
Releases(0.8.1)
  • 0.8.1(May 26, 2021)

    This is primarily a bug-fix and maintenance release.

    New features include interactive waveform visualization, signal de-emphasis effect, and expanded resampling modes.

    A full list of changes can be found at https://librosa.org/doc/main/changelog.html#v0-8-1

  • 0.8.1rc2(May 25, 2021)

  • 0.8.1rc1(May 23, 2021)

    First release candidate for 0.8.1.

    This is primarily a bug-fix and maintenance release. A full list of changes can be found at https://librosa.org/doc/main/changelog.html#v0-8-1

  • 0.8.0(Jul 21, 2020)

    First release of the 0.8 series.

    Major changes include:

    • Removed support for Python 3.5 and earlier.
    • Added pitch tracking (yin and pyin)
    • Variable-Q transform
    • Hindustani and Carnatic notation support
    • Expanded collection of example tracks
    • Numerous speedups and bugfixes
  • 0.7.2(Jan 13, 2020)

    This is primarily a bug-fix release, and most likely the last release in the 0.7 series.

    It includes fixes for errors in dynamic time warping (DTW) and RMS energy calculation, and several corrections to the documentation.

    Inverse-liftering is now supported in MFCC inversion, and an implementation of mu-law companding has been added.

    Please refer to the documentation for a full list of changes.

  • 0.7.1(Oct 9, 2019)

    This minor revision includes mainly bug fixes, but there are a few new features as well:

    • Griffin-Lim for constant-Q spectra
    • Multi-dimensional in-place framing
    • Enhanced compatibility with HTK for MFCC generation
    • Time-frequency reassigned spectrograms

    Please refer to the documentation for a full list of changes.

  • 0.7.0(Jul 7, 2019)

    First release of the 0.7 series.

    Major changes include streaming mode, feature inversion, faster decoding, more efficient spectral transformations, and numerous API enhancements.

  • 0.7.0rc1(Jul 1, 2019)

    First release candidate of the 0.7 series.

    Major changes include streaming mode, faster decoding, more efficient spectral transformations, and numerous API enhancements.

  • 0.6.3(Feb 13, 2019)

  • 0.6.2(Aug 9, 2018)

  • 0.6.1(May 24, 2018)

    0.6.1 final release. This contains no substantial changes from 0.6.1rc0.

    The major changes from 0.6.0 include:

    • new module librosa.sequence for Viterbi decoding
    • Per-channel energy normalization (librosa.pcen())

    As well as numerous bug-fixes and acceleration enhancements.

  • 0.6.1rc0(May 22, 2018)

    First release candidate for 0.6.1.

    This is primarily a bugfix release, though two new features have been added: per-channel energy normalization (pcen) and Viterbi decoding (librosa.sequence module).

  • 0.6.0(Feb 17, 2018)

  • 0.6.0rc1(Feb 13, 2018)

  • 0.6.0rc0(Feb 10, 2018)

    First release candidate for 0.6.

    This is a major revision, and contains numerous bugfixes and some small API changes that break backward compatibility with the 0.5 series. A full changelog is provided in the documentation.

  • 0.5.1(May 8, 2017)

  • 0.5.0rc0(Feb 11, 2017)

  • 0.4.3rc0(May 15, 2016)

  • 0.4.2(Feb 20, 2016)

  • 0.4.1(Oct 16, 2015)

    This minor revision expands the rhythm analysis functionality, and fixes several small bugs.

    It is also the first release to officially support Python 3.5.

    For a complete list of changes, refer to the CHANGELOG.

  • 0.4.1rc0(Oct 13, 2015)

  • 0.4.0rc2(May 22, 2015)

  • 0.4.0rc1(Mar 4, 2015)

    There are still a few issues to clean up with the 0.4 milestone, but these mainly relate to testing.

    This rc should be essentially feature complete.

  • v0.1.0(Jan 22, 2014)
