Python is one of the most popular programming languages in the world, with a growing number of libraries and frameworks to facilitate AI and ML development. There are well over 250 libraries in Python, and it can be confusing to know which one is best for your project and to keep up with the changes and trends that come with all of them.
Here are the popular Python machine learning libraries I have used. I've done my best to categorize them by the scenarios they are used in. There are many other libraries, but I can't speak to the ones I haven't used, and I think these are the most widely used.
NumPy is a well-known general-purpose array-processing package rather than a machine learning library in its own right. It provides high-performance, natively compiled support for n-dimensional arrays (vectors, matrices, and higher-order tensors) and a wide range of operations on them. It supports vectorized operations: a single Python expression is dispatched to low-level compiled code that loops implicitly over the elements of the data.
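To make the idea of vectorized operations concrete, here is a minimal sketch. The sample matrix and the variable names are made up for illustration; the point is only that one array expression replaces an explicit Python loop.

```python
import numpy as np

# Hypothetical sample data: a 2x3 matrix
a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# One expression, applied element-wise in compiled code (no Python loop)
scaled = a * 2 + 1

# Equivalent pure-Python nested loop, shown only for comparison
looped = [[x * 2 + 1 for x in row] for row in a.tolist()]

print(scaled)
```

Both versions compute the same values; the vectorized form is shorter and avoids per-element interpreter overhead.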
numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
The start and stop parameters are both required. The function returns num values evenly spaced over the interval from start to stop.
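A short sketch of how linspace behaves, using arbitrary example bounds of 0 and 1. The variable names are illustrative only.

```python
import numpy as np

# Five evenly spaced values from 0 to 1; the endpoint is included by default
points = np.linspace(0.0, 1.0, num=5)
print(points)  # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False excludes stop; retstep=True also returns the spacing
half_open, step = np.linspace(0.0, 1.0, num=5, endpoint=False, retstep=True)
print(half_open, step)
```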
Use numpy.repeat(a, repeats, axis=None) to repeat the elements of an array. The second argument, repeats, sets how many times each element is repeated.
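As a small illustration of repeat, the sketch below uses a made-up 2x2 array and shows the effect of the axis argument.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# With axis=None (the default) the array is flattened first
flat = np.repeat(a, 2)
print(flat)  # [1 1 2 2 3 3 4 4]

# With axis=0 each row is repeated as a whole
rows = np.repeat(a, 2, axis=0)
print(rows)
```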
numpy.random.randint(low, high=None, size=None, dtype='l') returns random integers from the half-open interval [low, high). If high is omitted (None), the values are drawn from [0, low).
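The half-open interval is easy to get wrong, so here is a minimal sketch. The seed and the dice-roll framing are arbitrary choices for the example.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# Ten dice rolls: low=1 is included, high=7 is excluded, so values are 1..6
rolls = np.random.randint(1, 7, size=10)

# With a single argument, values come from 0 up to (but not including) 5
below_five = np.random.randint(5, size=10)

print(rolls, below_five)
```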
In short, NumPy's optimized, precompiled C code handles the heavy lifting, which makes array operations much faster than equivalent loops over standard Python lists.
NumPy makes many of the mathematical routines used in scientific computing fast and easy to use.
Pandas has become the most widely used Python data analysis library because it provides fast, flexible, and expressive data structures for working with "relational" and "labeled" data. Many practical, real-world data analysis problems in Python call for Pandas. It is thoroughly optimized and highly reliable, and its backend code is written purely in C or Python.
The first functions to mention are read_csv and read_excel. Their names are self-explanatory: I use them to read data from CSV or Excel files into a pandas DataFrame.
df = pd.read_csv("PlayerStat.csv")
The read_csv() function can also read .txt files using the following syntax:
data = pd.read_csv("file.txt", sep=" ")
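Here is a self-contained sketch of both calls. To avoid depending on files on disk, it feeds read_csv from in-memory strings; the column names and values are invented stand-ins for the contents of PlayerStat.csv and file.txt.

```python
import io
import pandas as pd

# Hypothetical file contents (the real files are not shown in this article)
csv_text = "player,goals\nAda,3\nBen,5\n"
txt_text = "player goals\nAda 3\nBen 5\n"

# Comma-separated input uses the default separator
df_csv = pd.read_csv(io.StringIO(csv_text))

# Space-separated text needs an explicit sep argument
df_txt = pd.read_csv(io.StringIO(txt_text), sep=" ")

print(df_csv)
```

With real files you would pass the path string instead of the StringIO object.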
Boolean expressions can filter or query data. With the query function I can pass the filter criteria as a string, which offers more flexibility than many other tools.
df.query("A > 4")
This returns only the rows where A is greater than 4.
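The query call can be tried on a small made-up DataFrame. The sketch also shows the equivalent boolean-mask form for comparison; the data is illustrative only.

```python
import pandas as pd

# Hypothetical data with a column named A, as in the query above
df = pd.DataFrame({"A": [1, 5, 7, 3], "B": [10, 20, 30, 40]})

# Filter criteria passed as a string
filtered = df.query("A > 4")

# The same filter written as a boolean mask
masked = df[df["A"] > 4]

print(filtered)
```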
With an indexer such as iloc, I pass the row and column positions and get back the matching subset of the DataFrame.
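The text does not name the function, so the following sketch assumes it is the position-based iloc indexer. The DataFrame is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 5, 7], "B": [10, 20, 30], "C": ["x", "y", "z"]})

# Assumption: position-based selection with iloc
# Rows 0 and 1, columns at positions 0 and 2 (A and C)
subset = df.iloc[0:2, [0, 2]]

# A single cell: row 1, column 1
cell = df.iloc[1, 1]

print(subset)
print(cell)  # 20
```

For label-based selection, loc plays the same role with row and column labels instead of positions.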
Another very basic and popular feature is dtypes. Before starting any analysis, visualization, or predictive modeling, you must know the data types of your variables. The dtypes attribute gives the data type of each column.
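As a quick sketch of checking column types, the example below assumes the feature in question is the DataFrame dtypes attribute and uses invented data.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada", "Ben"], "goals": [3, 5], "rating": [7.5, 8.1]})

# One dtype per column
print(df.dtypes)

# dtype.kind is a one-letter code: "i" for integers, "f" for floats
goals_kind = df["goals"].dtype.kind
rating_kind = df["rating"].dtype.kind
```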
Vaex is an alternative to Pandas that uses out-of-core DataFrames to process large amounts of data faster. It is a high-performance Python module for lazy, out-of-core DataFrames (similar to Pandas), built for visualizing and exploring large tabular datasets. It can compute simple statistics at a rate of more than 1 billion rows per second, and it supports a variety of visualizations, allowing for extensive interactive data exploration.
TensorFlow is a Python library for fast numerical computation created and released by Google. TensorFlow uses different terminology and function names than Theano, which can make switching from Theano harder than it needs to be. However, the computational graph in TensorFlow works much like Theano's, with the same advantages and disadvantages. TensorFlow's eval function makes it only slightly easier to inspect intermediate states, even though modifications to the computational graph have a significant impact on performance. Compared with Theano and Caffe, which were popular a few years ago, TensorFlow is now the preferred deep learning technology.
tf.zeros_like returns a tensor of the same type and shape as the input tensor, with every element set to zero.
tensor = tf.constant([[1, 2, 3], [4, 5, 6]])
tf.zeros_like(tensor)  # [[0, 0, 0], [0, 0, 0]]
This function can be helpful when creating a black image from an input image. If you want to specify the shape directly, use tf.zeros. If you prefer to initialize with 1 instead of 0, use tf.ones_like.
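The three related calls can be compared side by side. This is a minimal sketch that assumes TensorFlow 2.x, where tensors can be inspected directly with .numpy().

```python
import tensorflow as tf

# Assumes TensorFlow 2.x (eager execution on by default)
tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

zeros = tf.zeros_like(tensor)                 # same shape and dtype, all zeros
ones = tf.ones_like(tensor)                   # same shape and dtype, all ones
explicit = tf.zeros([2, 3], dtype=tf.int32)   # shape given directly

print(zeros.numpy())
print(ones.numpy())
```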
tf.pad increases the size of a tensor by adding the specified padding around it, filled with a constant value.
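The text does not name the function, so this sketch assumes it is tf.pad in its constant mode, running under TensorFlow 2.x. The padding amounts are arbitrary.

```python
import tensorflow as tf

# Assumes TensorFlow 2.x
t = tf.constant([[1, 2, 3], [4, 5, 6]])

# One [before, after] pair per dimension:
# 1 row above and below, 2 columns left and right
paddings = [[1, 1], [2, 2]]

padded = tf.pad(t, paddings, mode="CONSTANT", constant_values=0)
print(padded.numpy())
```

The padded tensor keeps its rank of 2; only its shape grows, here from (2, 3) to (4, 7).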
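Eager execution can help you when running TensorFlow applications. With eager execution, you do not need to build a graph and run it in a session.
In TensorFlow 1.x, enabling eager execution must be the first statement after importing TensorFlow; in TensorFlow 2.x it is on by default.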
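A minimal sketch of what eager execution looks like in practice, assuming TensorFlow 2.x, where it is already enabled and no session is needed. The matrix values are arbitrary.

```python
import tensorflow as tf

# Assumes TensorFlow 2.x, where eager execution is the default
print(tf.executing_eagerly())

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# The operation runs immediately; there is no graph to build or session to run
y = tf.matmul(x, x)
result = y.numpy()
print(result)
```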
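PyTorch, a Python implementation of Torch, is backed by Facebook. It competes with the technologies above by building the computational graph on the fly, and by not treating the graph as a separate, opaque object it makes PyTorch code fit in better with the surrounding Python. Instead, there are many flexible techniques for constructing tensor computations on the fly. It also performs well. It has strong multi-GPU capabilities, much like TensorFlow; however, TensorFlow is still better suited to larger-scale distributed systems. PyTorch's API is well documented, but the TensorFlow and Keras APIs are more polished. Even so, PyTorch wins on flexibility and usability without sacrificing performance, which has undoubtedly forced TensorFlow to rethink and adapt. TensorFlow has recently been seriously challenged by PyTorch.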
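The on-the-fly graph construction described above can be sketched in a few lines. The values and the branching condition are invented for illustration; the point is that ordinary Python control flow decides what the graph looks like on each run.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = x * 2

# Plain Python control flow shapes the computation as it runs
if y.sum() > 5:
    y = y ** 2

loss = y.sum()

# Gradients flow back through whichever branch was actually taken
loss.backward()

print(loss.item())
print(x.grad)
```

Here the branch is taken, so the loss is the sum of (2x)^2 and the gradient with respect to x is 8x.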
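Keras is an open-source software library that provides a Python interface for artificial neural networks. Because Keras is nominally engine-independent, Keras code can in theory be reused even if the engine has to change for performance or other reasons. Its drawback is that when you want to build very novel or specialized architectures, you often need to drop below the Keras layer and use TensorFlow or Theano directly. This mostly happens when you need complex NumPy-style indexing, which corresponds to gather/scatter in TensorFlow and set/inc subtensor in Theano.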
In Keras, both evaluate() and predict() are available. Both methods accept NumPy datasets. I use them to assess a model: once it has been run on the test data, I evaluate the results.
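Here is a minimal sketch of evaluate() and predict() on NumPy data. It assumes the Keras bundled with TensorFlow 2.x; the random dataset, layer sizes, and training settings are arbitrary placeholders, not a recommended setup.

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy dataset: 32 samples with 4 features and a binary label
rng = np.random.default_rng(0)
x = rng.random((32, 4)).astype("float32")
y = (x.sum(axis=1) > 2.0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, verbose=0)

# evaluate() returns the loss followed by the compiled metrics
loss, accuracy = model.evaluate(x, y, verbose=0)

# predict() returns the raw model outputs as a NumPy array
predictions = model.predict(x, verbose=0)
print(loss, accuracy, predictions.shape)
```

In a real project the evaluation data would be a held-out test set rather than the training data reused here.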
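Keras offers many kinds of layers, each with its own methods. These layers help with building, configuring, and training a model. The Dense layer implements the core operation. I use Flatten to flatten the input, Dropout applies dropout to the input, Reshape reshapes the output to a given shape, and Input instantiates a Keras tensor.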
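The layers mentioned above can be chained in a small functional model. This sketch assumes the Keras bundled with TensorFlow 2.x; the shapes and the dropout rate are arbitrary.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(4, 4))           # instantiates a Keras tensor
x = layers.Flatten()(inputs)                 # (4, 4) -> (16,)
x = layers.Dense(8, activation="relu")(x)    # fully connected layer
x = layers.Dropout(0.5)(x)                   # drops inputs at random during training
outputs = layers.Reshape((2, 4))(x)          # (8,) -> (2, 4)

model = keras.Model(inputs, outputs)
model.summary()
```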
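You can obtain the output of an intermediate layer.
Keras is a fairly simple library. It makes it possible to get the output of a layer in the middle of a model: you can easily attach a new layer to an existing one to obtain that intermediate output.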
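One common way to read an intermediate output, sketched under the assumption of the Keras bundled with TensorFlow 2.x, is to build a second model that shares the original layers but stops at the layer of interest. The layer names and sizes here are hypothetical.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(4,))
hidden = layers.Dense(3, activation="relu", name="hidden")(inputs)
outputs = layers.Dense(1, name="out")(hidden)
model = keras.Model(inputs, outputs)

# A second model that reuses the same layers but ends at "hidden"
intermediate = keras.Model(
    inputs=model.input,
    outputs=model.get_layer("hidden").output,
)

features = intermediate.predict(np.ones((2, 4), dtype="float32"), verbose=0)
print(features.shape)
```

Because the two models share weights, training the full model also updates what the intermediate model returns.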
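Theano is a Python library and optimizing compiler for manipulating and evaluating mathematical expressions, especially matrix-valued ones. Being the oldest and most mature of these libraries gives Theano both advantages and disadvantages. Because it has been around for so long, most user-requested features have been added. However, some of those implementations are overly complex and hard to use, since there was no precedent to follow. The documentation is passable but ambiguous. Because there is no simple way to inspect intermediate computations, getting a complex project to work in Theano can be very challenging. Users typically debug with a debugger or by inspecting the computational graph.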
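I declare a double-precision floating-point scalar variable with the dscalar method. When the statement below runs, it creates a variable named C in your program.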
C = tensor.dscalar()
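theano.function takes two arguments: the first is the input and the second is the function's output. In the statement below, the first argument is a list containing the two items C and D. The result is a scalar, designated E.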
f = theano.function([C,D], E)
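I have seen a highly skilled Python programmer quickly grasp the subtleties of a new library and learn how to use it. But whether you are a beginner, intermediate, or expert, choosing a programming language, or in this case one library over another, depends largely on the goals and needs of your project.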