Finding the Source Code of NumPy's percentile Function

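This article explains how to find the source code of the percentile function in NumPy.

Python plays an important role in data science, and NumPy is one of the most popular open-source libraries for scientific computing in Python. It provides a powerful array data structure for storing and processing large amounts of data, and percentile is one of its most commonly used functions.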


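The NumPy percentile Function

NumPy's percentile function computes the value at a given percentile of the data in an array. Its basic call signature, omitting the less common keyword arguments, is: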

numpy.percentile(a, q, axis=None)

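The parameters have the following meanings:

a: the input array;
q: the percentile to compute, given either as a single number or as a sequence of numbers, each between 0 and 100 inclusive;
axis: the axis along which to compute. The default is None, which computes over the flattened array.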
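To make the axis parameter concrete, here is a minimal sketch. The sample array and the variable names are illustrative choices, not part of NumPy.

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

# axis=0 reduces over the rows, giving one median per column.
per_column = np.percentile(a, 50, axis=0)
# axis=1 reduces over the columns, giving one median per row.
per_row = np.percentile(a, 50, axis=1)

print(per_column)  # [50. 40. 60.]
print(per_row)     # [40. 20. 60.]
```

With axis=None, the default, the array is flattened first and a single value is returned for each requested percentile.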

Here is an example that uses the default axis=None:

import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

print(np.percentile(a, 50))  # 50th percentile of a
print(np.percentile(a, (25, 75)))  # 25th and 75th percentiles of a

The output is:

50.0
[30. 70.]
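By default, NumPy interpolates linearly between the two nearest sorted values. The sketch below reimplements that rule by hand so the results can be checked. The helper percentile_linear is a hypothetical name introduced here for illustration. It is not NumPy's implementation and handles only a single scalar q on flattened data.

```python
import numpy as np


def percentile_linear(values, q):
    """Hypothetical helper: q-th percentile of flattened data, linear rule."""
    data = np.sort(np.asarray(values, dtype=float).ravel())
    # Fractional position of the percentile in the sorted data.
    pos = (q / 100.0) * (data.size - 1)
    lower = int(np.floor(pos))
    upper = min(lower + 1, data.size - 1)
    fraction = pos - lower
    # Interpolate between the two neighbouring sorted values.
    return data[lower] + (data[upper] - data[lower]) * fraction


a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])
for q in (25, 50, 75, 90):
    print(q, percentile_linear(a, q), np.percentile(a, q))
```

For this nine-element array, the 25th percentile falls exactly on sorted index 2 and the 75th on sorted index 6, so no interpolation is needed for those two values.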

Finding the Source Code of the NumPy percentile Function

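NumPy is open source, so anyone can read its implementation. To find the source code of percentile, follow these steps:

Find the NumPy source repository, either through the official website (https://numpy.org/) or directly on GitHub (https://github.com/numpy/numpy). This article uses the GitHub repository;
Navigate to the numpy directory in the repository;
Inside the numpy directory, open the lib subdirectory and find the module that defines percentile. There is no file named percentile.py: the function lives in function_base.py in NumPy 1.x and in _function_base_impl.py in NumPy 2.x.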
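When browsing the repository, it helps to read the same version that is installed locally, because file names and function bodies change between releases. The sketch below prints the installed version, the local package directory, and a GitHub URL for the matching release. The URL pattern assumes that release tags are named "v" followed by the version number, which will not hold for development builds.

```python
import os

import numpy as np

version = np.__version__
package_dir = os.path.dirname(np.__file__)
# Assumption: release tags in the GitHub repository are named "v<version>".
tag_url = f"https://github.com/numpy/numpy/tree/v{version}/numpy"

print("Installed NumPy version:", version)
print("Local package directory:", package_dir)
print("Matching tree on GitHub (release versions only):", tag_url)
```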

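The following code prints the path of the file that defines percentile in the installed copy of NumPy:

import inspect
import numpy as np

# Recent NumPy releases wrap public functions in a dispatcher object,
# so unwrap first to reach the underlying Python function.
print(inspect.getsourcefile(inspect.unwrap(np.percentile)))

On NumPy 1.x the output looks like this (NumPy 2.x reports numpy/lib/_function_base_impl.py instead):

/xxx/numpy/lib/function_base.py

This tells us that percentile is implemented in function_base.py under numpy/lib. Opening that file in an older NumPy 1.x release and searching for percentile shows a definition along the following lines (newer releases rename the interpolation parameter to method and have a longer docstring):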

def percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False):
    """
    Compute the q-th percentile of the data along the specified axis.

    Returns the q-th percentile(s) of the array elements.

    Parameters
    ----------
    a : array_like
        Input array or object that can be converted to an array.
    q : array_like of float
        Percentile or sequence of percentiles to compute, which must be between
        0 and 100 inclusive.
    axis : {int, tuple of int, None}, optional
        Axis or axes along which the percentiles are computed. The default is
        to compute the percentile(s) along a flattened version of the array.
    out : ndarray, optional
        Alternative output array in which to place the result. It must have
        the same shape and buffer length as the expected output, but the
        type (of the output) will be cast if necessary. See `doc.ufuncs`
        (Section "Output arguments") for more details.
    overwrite_input : bool, optional
        If True, then allow use of memory of input array `a` for calculations.
        The input array will be modified by the call to the percentile method.
        This will save memory when you do not need to preserve the contents of
        the input array. NaNs are returned for masked locations in the input
        array. If False, a copy of `a` is made.  Even when
        `overwrite_input` is False, strange numerical results may still occur
        if the input array is not a float type.
    interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}
        This optional parameter specifies the interpolation method to use,
        when the percentile lies between two data points a and b.
        * linear: `a + (b - a) * fraction`, where `fraction` is the
          fractional part of the index surrounded by `a` and `b`.
        * lower: `a`.
        * higher: `b`.
        * nearest: `a` or `b` whichever is nearest.
        * midpoint: `(a + b) / 2`.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left
        in the result as dimensions with size one. With this option,
        the result will broadcast correctly against the original `a`.

    Returns
    -------
    percentile : ndarray
        The values, or the value, of the q-th percentile(s).

    See Also
    --------
    mean, median

    Notes
    -----
    Given a vector V of length N, the q-th percentile of V is the value q/100
    of the way from the minimum to the maximum values in V. The values of
    `percentile` can be interpreted as follows.  Suppose
    `sorted( a )` is `a.sort()`.  Then,

        - `percentile(a, 0.0)  = min(a)`
        - `percentile(a, 100)  = max(a)`
        - `percentile(a, 50.)  = median(a)`

    Examples
    --------
    >>> a = np.array([1, 2, 3, 4, 5])
    >>> np.percentile(a, 50)
    3.0
    >>> np.percentile(a, 50, interpolation='nearest')
    3
    >>> np.percentile(a, [0, 100])
    array([ 1.,  5.])
    >>> np.percentile(a, [0, 100], keepdims=True)
    array([[ 1.],
           [ 5.]])
    >>> m = np.percentile(a, 50)
    >>> out = np.zeros_like(m)
    >>> np.percentile(a, 50, out=out)
    array(3.0)
    >>> m
    3.0
    >>> b = a.reshape(5, 1)
    >>> np.percentile(b, [0, 100], axis=0)
    array([[ 1.],
           [ 5.]])
    >>> np.percentile(b, [0, 100], axis=1)
    array([[ 1.,  2.,  3.,  4.,  5.],
           [ 1.,  2.,  3.,  4.,  5.]])
    """
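    q = np.true_divide(q, 100)
    q = asanyarray(q)  # undo any decay that the ufunc performed (see gh-13105)
    if not _quantile_is_valid(q):
        raise ValueError("Percentiles must be in the range [0, 100]")
    return _quantile_unchecked(
        a, q, axis, out, overwrite_input, interpolation, keepdims)

As the excerpt shows, the public percentile function is a thin wrapper: it rescales q from the range [0, 100] to [0, 1], validates it, and then delegates the actual computation to the private helper _quantile_unchecked. These helper names are internal and change between releases, so the file in your installed version may differ.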
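Instead of opening the file by hand, the standard library's inspect module can print the source directly. The sketch below unwraps np.percentile first, on the assumption that the installed NumPy wraps it in a dispatcher that exposes the original function through __wrapped__. On releases without such a wrapper, inspect.unwrap simply returns the function unchanged.

```python
import inspect

import numpy as np

# Reach the plain Python function behind any dispatcher wrapper.
func = inspect.unwrap(np.percentile)

# source_lines holds the definition, including decorators and docstring;
# first_line is the line number where it starts in the file.
source_lines, first_line = inspect.getsourcelines(func)

print(inspect.getsourcefile(func), "line", first_line)
print("".join(source_lines[:5]))
```

Printing only the first few lines keeps the output short. Use "".join(source_lines) to see the whole function, including the code after the docstring.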

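Summary

This article showed how to find the source code of NumPy's percentile function: locate the NumPy repository, use the inspect module to identify the file that defines percentile in the installed version, and read the function's definition there. This is a useful technique for developers who need to study the implementation details of percentile.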