Python NumPy Tutorial for Beginners

⚡ Smart Summary

NumPy is the foundational Python library for numerical computing, powering arrays, matrices, linear algebra, and random sampling. This NumPy tutorial walks through installation, array creation, reshaping, slicing, statistics, and matrix math with runnable Python examples.

  • Core Use: Fast N-dimensional arrays and vectorized math in Python.
  • ⚙️ Key Functions: zeros, ones, reshape, hstack, dot, linspace.
  • 📚 Foundations: Linear algebra, broadcasting, statistics, slicing.
  • 🚀 Current Version: NumPy 2.x (released June 2024) is now stable.
  • 🤖 AI Use: Backbone of PyTorch, TensorFlow, JAX, and CuPy tensors.

What is NumPy in Python?

NumPy is an open source Python library for mathematical, scientific, engineering, and data science programming. It is well suited to mathematical and statistical operations, handles multi-dimensional arrays and matrix multiplication efficiently, and integrates with code written in C/C++ and Fortran.

For any scientific project, NumPy is the tool to know. It is built for the N-dimensional array, linear algebra, random number generation, Fourier transforms, and more.

NumPy gives Python first-class support for multi-dimensional arrays and matrices, plus a large set of mathematical operations that act on them. In this guide, we will review the essential functions you need to know before moving on to the tutorial on ‘TensorFlow.’

Why use NumPy?

NumPy is memory efficient, so it can handle large volumes of numerical data more easily than pure Python lists. It is also very convenient for matrix multiplication and reshaping, and it is fast because the heavy work is delegated to compiled C routines. In fact, libraries such as scikit-learn are built directly on NumPy arrays, and TensorFlow and PyTorch model their tensors on them and convert to and from them readily, which makes NumPy the silent workhorse beneath most modern data science and AI workflows.

How to Install NumPy

To install the NumPy library, refer to our tutorial How to install TensorFlow. NumPy is installed by default with Anaconda.

In the rare case that NumPy is not installed, use any of the options below.

You can install NumPy using Anaconda:

conda install -c anaconda numpy

In Jupyter Notebook:

import sys
!conda install --yes --prefix {sys.prefix} numpy

Pip users can install or upgrade with a single command:

pip install --upgrade numpy

Import NumPy and Check Version

The command to import numpy is:

import numpy as np

The line above renames the NumPy namespace to np. This shortcut lets you prefix NumPy functions, methods, and attributes with np. instead of typing numpy., and it is the standard convention you will find across the NumPy ecosystem.

To check your installed version of NumPy, use the command below:

print(np.__version__)

Output:

2.1.3

NumPy 2.0 shipped in June 2024 and is the current stable major series. If your project still depends on the 1.x API, pin to numpy<2 in your requirements file because some legacy aliases such as np.float_ and np.NaN were removed in 2.0. (The older np.float and np.int aliases were already removed in NumPy 1.24.)

What is a Python NumPy Array?

NumPy arrays look a bit like Python lists at first glance, but they behave very differently. For readers new to the topic, let us clarify what an ndarray is and why it matters.

A NumPy array is the central data structure of the NumPy library. The library's name is short for “Numeric Python” or “Numerical Python”, and its array type, the ndarray, is a homogeneous, fixed-size block of memory with a known shape and data type.

Creating a NumPy Array

The simplest way to create an array in NumPy is to start from a Python List.

myPythonList = [1,9,8,3]

Convert the Python list to a NumPy array using np.array.

numpy_array_from_list = np.array(myPythonList)

Display the contents of the array.

numpy_array_from_list

Output:

array([1, 9, 8, 3])

In practice, there is no need to declare a Python list first. The two steps can be combined.

a = np.array([1,9,8,3])

NOTE: The NumPy documentation also mentions np.ndarray, but np.array is the recommended factory function for everyday use.

You can also create a NumPy array from a tuple in the same way.

Mathematical Operations on an Array

You can perform mathematical operations such as addition, subtraction, division, and multiplication on an array. The syntax is the array name, followed by the operator (+, -, *, /), followed by the operand. These operations are vectorized, which means NumPy applies them to every element without an explicit Python loop.

Example:

numpy_array_from_list + 10

Output:

array([11, 19, 18, 13])

This operation adds 10 to each element of the NumPy array.
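
As a minimal sketch of what vectorized means in practice, the snippet below compares an explicit Python loop with the equivalent NumPy expression. The variable names (values, looped, vectorized) are illustrative and do not appear elsewhere in this tutorial.

```python
import numpy as np

values = np.array([1, 9, 8, 3])

# Explicit Python loop: handles one element at a time
looped = np.array([v + 10 for v in values])

# Vectorized form: NumPy applies the operation to every element
vectorized = values + 10

# The other arithmetic operators work element-wise in the same way
doubled = values * 2
halved = values / 2      # true division always yields floats

print(vectorized)        # [11 19 18 13]
print(doubled)           # [ 2 18 16  6]
print(halved)            # [0.5 4.5 4.  1.5]
```

Both forms give the same result, but the vectorized expression is shorter and runs its loop in compiled code.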

Shape of Array

You can check the shape of an array with the shape attribute, accessed by appending it to the array name. In the same way, you can check the type with dtype.

import numpy as np
a = np.array([1,2,3])
print(a.shape)
print(a.dtype)

(3,)
int64

An integer is a value without a decimal point. If you create an array with decimals, the dtype changes to float.

#### Different type
b = np.array([1.1,2.0,3.2])
print(b.dtype)

float64
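
The short sketch below illustrates how the dtype is chosen and how it can be changed. It assumes only standard NumPy calls (np.array with a dtype argument, and the astype method); the variable names are made up for the example.

```python
import numpy as np

mixed = np.array([1, 2.5, 3])        # one float upcasts the whole array
print(mixed.dtype)                   # float64

as_int = mixed.astype(np.int64)      # astype returns a converted copy
print(as_int)                        # [1 2 3]  (fractional part is truncated)

explicit = np.array([1, 2, 3], dtype=np.float32)  # request a dtype up front
print(explicit.dtype)                # float32
```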

2 Dimension Array

You can add a dimension by nesting sequences: each inner tuple becomes one row of the array.

Note that the inner tuples have to sit inside the outer brackets [].

### 2 dimension
c = np.array([(1,2,3),
              (4,5,6)])
print(c.shape)

Output:

(2, 3)

3 Dimension Array

Higher dimensions can be constructed in the same way.

### 3 dimension
d = np.array([
    [[1, 2, 3],
     [4, 5, 6]],
    [[7, 8, 9],
     [10, 11, 12]]
])
print(d.shape)

Output:

(2, 2, 3)

Objective | Code
Create array | array([1,2,3])
Print the shape | array([.]).shape

What is numpy.zeros()?

numpy.zeros() or np.zeros is a Python function used to create a matrix filled with zeros. numpy.zeros() is useful when you need to initialise weights during the first iteration in TensorFlow or set up placeholder buffers for other statistical tasks.

numpy.zeros() function Syntax

numpy.zeros(shape, dtype=float, order='C')

Python numpy.zeros() Parameters

Here,

  • Shape: shape of the NumPy zero array.
  • Dtype: data type of the elements. It is optional, and the default value is float64.
  • Order: default is C, which is the row-major layout used by numpy.zeros() in Python.

Python numpy.zeros() Example

import numpy as np
np.zeros((2,2))

Output:

array([[0., 0.],
       [0., 0.]])

Example of numpy zero with Datatype

import numpy as np
np.zeros((2,2), dtype=np.int16)

Output:

array([[0, 0],
       [0, 0]], dtype=int16)

What is numpy.ones()?

The np.ones() function creates an array filled with ones. numpy.ones() in Python is useful when you initialise weights during the first iteration in TensorFlow and for other statistical tasks.

Python numpy.ones() Syntax

numpy.ones(shape, dtype=float, order='C')

Python numpy.ones() Parameters

Here,

  • Shape: shape of the np.ones Python Array.
  • Dtype: data type of the elements. It is optional, and the default value is float64.
  • Order: default is C, which is the row-major layout.

Python numpy.ones() 3D Array with Datatype Example

import numpy as np
np.ones((1,2,3), dtype=np.int16)

Output:

array([[[1, 1, 1],
        [1, 1, 1]]], dtype=int16)
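
As a small sketch of the shape and dtype parameters described above, the snippet below shows that shape may be a single integer or a tuple, and that float64 is used when no dtype is given. The variable names are illustrative only.

```python
import numpy as np

vec = np.zeros(3)                 # an int shape gives a 1D array
mat = np.ones((2, 3))             # a tuple shape gives a 2D array
print(vec.dtype, mat.dtype)       # float64 float64

weights = np.zeros((2, 3), dtype=np.float32)  # explicit dtype
print(weights.shape, weights.dtype)           # (2, 3) float32
```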

numpy.reshape() function in Python

Python NumPy Reshape changes the shape of an array without changing its data. You may need to reshape data from wide to long format, or move from a 1D vector to a 2D matrix before feeding it to a model. The np.reshape function handles this in one call.

Syntax of np.reshape()

numpy.reshape(a, newshape, order='C')

Here,

a: array that you want to reshape.

newshape: the new desired shape. NumPy 2.1 and later name this parameter shape.

Order: default is C, the row-major layout.

Example of NumPy Reshape

import numpy as np
e = np.array([(1,2,3), (4,5,6)])
print(e)
e.reshape(3,2)

Output:

Before reshape:

[[1 2 3]
 [4 5 6]]

After reshape:

array([[1, 2],
       [3, 4],
       [5, 6]])
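
The following sketch extends the reshape example under two assumptions that the text above does not cover: passing -1 lets NumPy infer one dimension, and a shape with the wrong number of elements raises ValueError. The names tall and same are illustrative.

```python
import numpy as np

e = np.array([(1, 2, 3), (4, 5, 6)])

# -1 asks NumPy to infer that dimension from the total size (6 elements)
tall = e.reshape(-1, 2)
print(tall.shape)                     # (3, 2)

# The function form gives the same result as the method form
same = np.reshape(e, (3, 2))
print(np.array_equal(tall, same))     # True

# A shape that does not hold exactly 6 elements raises ValueError
try:
    e.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)
```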

numpy.ndarray.flatten() in Python

Python NumPy Flatten returns a copy of the array collapsed into one dimension. When you work with neural networks such as convnets, you often need to flatten an image tensor before passing it to a dense layer. The flatten() method of an array handles this in a single call. Note that flatten() is a method of the ndarray, so there is no top-level np.flatten() function.

Syntax of flatten()

ndarray.flatten(order='C')

Here,

Order: default is C, the row-major layout.

Example of NumPy Flatten

e.flatten()

Output:

array([1, 2, 3, 4, 5, 6])
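
To illustrate the two points made about flatten, that it returns a copy and that order controls the read direction, here is a brief sketch. The variable names are hypothetical.

```python
import numpy as np

e = np.array([(1, 2, 3), (4, 5, 6)])

flat = e.flatten()                 # always a copy
flat[0] = 99
print(e[0, 0])                     # 1 -> the original is untouched

col_major = e.flatten(order='F')   # read column by column
print(col_major)                   # [1 4 2 5 3 6]
```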

What is numpy.hstack() in Python?

numpy.hstack is a Python function used to horizontally stack sequences of input arrays into a single array. With hstack(), you append data along the column axis. It is a very convenient helper in NumPy when you need to merge feature vectors side by side.

Let us study hstack in Python with an example.

Example:

## Horizontal Stack
import numpy as np
f = np.array([1,2,3])
g = np.array([4,5,6])
print('Horizontal Append:', np.hstack((f, g)))

Output:

Horizontal Append: [1 2 3 4 5 6]

What is numpy.vstack() in Python?

numpy.vstack is a Python function used to vertically stack sequences of input arrays into a single array. With vstack(), you append data along the row axis, which is handy when you want to combine batches of samples.

Let us study it with an example.

Example:

## Vertical Stack
import numpy as np
f = np.array([1,2,3])
g = np.array([4,5,6])
print('Vertical Append:', np.vstack((f, g)))

Output:

Vertical Append: [[1 2 3]
 [4 5 6]]
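
The examples above stack 1D arrays. As a rough sketch of how the same functions behave with 2D inputs, the snippet below shows which dimensions must match in each case. The array names are illustrative.

```python
import numpy as np

left = np.array([[1, 2], [3, 4]])    # shape (2, 2)
right = np.array([[5], [6]])         # shape (2, 1)

wide = np.hstack((left, right))      # rows must match; columns add up
print(wide.shape)                    # (2, 3)

extra_row = np.array([[7, 8]])       # shape (1, 2)
tall = np.vstack((left, extra_row))  # columns must match; rows add up
print(tall.shape)                    # (3, 2)
```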

After studying NumPy vstack and hstack, let us look at an example that generates random numbers in NumPy.

Generate Random Numbers using NumPy

To generate random numbers from a Gaussian distribution, use:

numpy.random.normal(loc, scale, size)

Here,

  • Loc: the mean. The centre of the distribution.
  • Scale: the standard deviation.
  • Size: number of samples to return, or the shape of the output as a tuple.

Example:

## Generate random numbers from a normal distribution
normal_array = np.random.normal(5, 0.5, 10)
print(normal_array)

Output:

[5.56171852 4.84233558 4.65392767 4.946659   4.85165567 5.61211317 4.46704244 5.22675736 4.49888936 4.68731125]

If plotted, the distribution is similar to the following plot.

Figure: Example to Generate Random Numbers using NumPy

Note that newer code increasingly uses the Generator API, created with np.random.default_rng(), which NumPy recommends over the legacy np.random.normal-style functions for new code.
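
A minimal sketch of the Generator API follows, drawing the same kind of sample as the example above. The seed value 42 is an arbitrary choice made here so that the run is repeatable; it is not taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeding makes the run repeatable
samples = rng.normal(loc=5, scale=0.5, size=10)

print(samples.shape)     # (10,)
print(samples.mean())    # close to 5, but not exact for so few draws
```

Creating one generator and passing it around keeps random state explicit, rather than relying on the global state used by the legacy functions.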

NumPy Asarray Function

The asarray() function converts an input to an array. The input can be a list, tuple, ndarray, or similar sequence.

Syntax:

numpy.asarray(data, dtype=None, order=None)

Here,

data: data that you want to convert to an array.

dtype: optional. If not specified, the data type is inferred from the input data.

Order: optional memory layout. C is the row-major layout, and F is the Fortran-style column-major layout.

Example:

Consider the following 2D matrix with four rows and four columns, filled with 1.

import numpy as np
A = np.matrix(np.ones((4,4)))

If you try to change a value in the matrix through np.array, the original is not modified. The reason is that np.array creates a copy of the matrix.

np.array(A)[2]=2
print(A)
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

The matrix is unchanged. Use asarray when you want to modify the original array in place. Let us see what happens when you set the third row to the value 2.

np.asarray(A)[2]=2
print(A)

Code Explanation:

np.asarray(A) converts the matrix A to an array view.

[2] selects the third row.

Output:

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.] # new value
 [1. 1. 1. 1.]]
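
The same copy-versus-alias behaviour can be sketched with a plain ndarray instead of np.matrix. The names original, copied, and aliased are illustrative.

```python
import numpy as np

original = np.ones((2, 2))

copied = np.array(original)      # np.array copies by default
aliased = np.asarray(original)   # already an ndarray, so no copy is made

copied[0] = 5
print(original[0])               # [1. 1.] -> unchanged

aliased[0] = 5
print(original[0])               # [5. 5.] -> changed through the alias

print(aliased is original)       # True
```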

What is numpy.arange()?

numpy.arange() is a built-in NumPy function that returns an ndarray containing evenly spaced values within a defined interval. For example, to create values from 1 to 10, you can call np.arange() in Python.

Syntax:

numpy.arange(start, stop, step, dtype)

Python NumPy arange Parameters:

  • Start: start of the interval for np.arange in Python.
  • Stop: end of the interval. The stop value itself is not included.
  • Step: spacing between values. Default step is 1.
  • Dtype: type of the output array for NumPy arange.

Example:

import numpy as np
np.arange(1, 11)

Output:

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

Example:

To change the step in this NumPy arange example, add a third number to the parentheses. It changes the step size.

import numpy as np
np.arange(1, 14, 4)

Output:

array([ 1,  5,  9, 13])

NumPy Linspace Function

linspace returns evenly spaced samples between two endpoints.

Syntax:

numpy.linspace(start, stop, num, endpoint)

Here,

  • Start: starting value of the sequence.
  • Stop: end value of the sequence.
  • Num: number of samples to generate. Default is 50.
  • Endpoint: if True (default), stop is the last value. If False, stop is not included.

Example:

For instance, it can be used to create 10 evenly spaced values from 1 to 5.

import numpy as np
np.linspace(1.0, 5.0, num=10)

Output:

array([1.        , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
       3.22222222, 3.66666667, 4.11111111, 4.55555556, 5.        ])

If you do not want to include the stop value in the interval, set endpoint to False.

np.linspace(1.0, 5.0, num=5, endpoint=False)

Output:

array([1. , 1.8, 2.6, 3.4, 4.2])
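
As a brief side-by-side sketch, the snippet below contrasts arange, where you fix the step, with linspace, where you fix the number of points. The retstep argument is a standard linspace option not covered above; the variable names are illustrative.

```python
import numpy as np

# arange: you choose the step; stop is excluded
by_step = np.arange(0, 1, 0.25)
print(by_step)                       # [0.   0.25 0.5  0.75]

# linspace: you choose the number of points; stop is included by default
by_count = np.linspace(0, 1, num=5)
print(by_count)                      # [0.   0.25 0.5  0.75 1.  ]

# retstep=True also returns the spacing that linspace used
points, step = np.linspace(0, 1, num=5, retstep=True)
print(step)                          # 0.25
```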

LogSpace NumPy Function in Python

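logspace returns numbers evenly spaced on a logarithmic scale. It shares its main parameters with np.linspace, but start and stop are exponents of the base, which is 10 by default, so the sequence runs from base**start to base**stop.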

Syntax:

numpy.logspace(start, stop, num, endpoint)

Example:

np.logspace(3.0, 4.0, num=4)

Output:

array([ 1000.        ,  2154.43469003,  4641.58883361, 10000.        ])
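
One way to read logspace is as a power of linspace. The sketch below checks that equivalence and shows the base argument, which is part of the real signature but not listed in the syntax above. The variable names are illustrative.

```python
import numpy as np

log_points = np.logspace(3.0, 4.0, num=4)          # base 10 by default
same_points = 10 ** np.linspace(3.0, 4.0, num=4)   # equivalent construction
print(np.allclose(log_points, same_points))        # True

powers_of_two = np.logspace(0, 4, num=5, base=2)
print(powers_of_two)                               # [ 1.  2.  4.  8. 16.]
```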

Finally, if you want to check the memory size of a single element in an array, you can use itemsize.

x = np.array([1,2,3], dtype=np.complex128)
x.itemsize

Output:

16

Each element takes 16 bytes.

Indexing and Slicing in Python

Slicing data is straightforward with NumPy. We will slice the matrix e. In Python, you use square brackets to return rows or columns.

Example:

## Slice
import numpy as np
e = np.array([(1,2,3), (4,5,6)])
print(e)

Output:

[[1 2 3]
 [4 5 6]]

Remember that in NumPy the first array index starts at 0.

## First row
print('First row:', e[0])

## Second row
print('Second row:', e[1])

Output:

First row: [1 2 3]
Second row: [4 5 6]

In NumPy, as in many other languages, a two-dimensional index is written as row, column:

  • The value before the comma stands for the rows.
  • The value after the comma stands for the columns.
  • If you want to select a column, put : before the comma and the column index after it.
  • : means you want all the rows from the selected column.

print('Second column:', e[:,1])

Output:

Second column: [2 5]

To return the first two values of the second row, use :2 to select the columns from the start up to, but not including, index 2.

## Second Row, two values
print(e[1, :2])

Output:

[4 5]
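
A few more indexing patterns are sketched below on the same matrix e: a single element, a negative index, a stepped slice, and the fact that a basic slice is a view of the original data. The variable first_row is illustrative.

```python
import numpy as np

e = np.array([(1, 2, 3), (4, 5, 6)])

print(e[0, 2])       # 3       -> single element: row 0, column 2
print(e[:, -1])      # [3 6]   -> last column via a negative index
print(e[:, ::2])     # every second column
# [[1 3]
#  [4 6]]

# A basic slice is a view, so writing to it changes e
first_row = e[0]
first_row[0] = 100
print(e[0, 0])       # 100
```

Call .copy() on a slice when you need an independent array.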

Statistical Functions in Python

NumPy provides a useful set of statistical functions for finding the minimum, maximum, percentile, standard deviation, variance, and more from a given array. The most common ones are summarised in the table below.

Function | NumPy
Min | np.min()
Max | np.max()
Mean | np.mean()
Median | np.median()
Standard deviation | np.std()
Consider the following array:

Example:

import numpy as np
normal_array = np.random.normal(5, 0.5, 10)
print(normal_array)

Output:

[5.56171852 4.84233558 4.65392767 4.946659   4.85165567 5.61211317 4.46704244 5.22675736 4.49888936 4.68731125]

Example of NumPy Statistical functions

### Min
print(np.min(normal_array))

### Max
print(np.max(normal_array))

### Mean
print(np.mean(normal_array))

### Median
print(np.median(normal_array))

### Sd
print(np.std(normal_array))

Output:

4.467042435266913
5.612113171990201
4.934841002270593
4.846995625786663
0.3875019367395316
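
On 2D data these functions also accept an axis argument. The sketch below uses a small made-up array, scores, so that the results are easy to verify by hand.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(np.mean(scores))            # 3.5 -> all elements
print(np.mean(scores, axis=0))    # [2.5 3.5 4.5] -> one value per column
print(np.max(scores, axis=1))     # [3 6] -> one value per row
print(np.percentile(scores, 50))  # 3.5 -> same as the median
```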

What is numpy dot product?

numpy.dot computes the dot product of two arrays and handles both 1D and 2D inputs. For two 1D arrays it returns their inner product, and for 2D inputs it performs matrix multiplication.

Syntax:

numpy.dot(x, y, out=None)

Parameters

Here,

x, y: input arrays. x and y should both be 1D or 2D for np.dot() to work as a dot product or matrix multiply.

out: optional output argument used to capture the result. Otherwise, an ndarray is returned.

Returns

The function numpy.dot() returns the dot product of two arrays x and y. It returns a scalar when both are 1D, otherwise it returns an array. If out is given, that array is returned.

Raises

The dot product in Python raises a ValueError if the last dimension of x does not match the second to last dimension of y.

Example:

## Linear algebra
### Dot product: product of two arrays
f = np.array([1,2])
g = np.array([4,5])
### 1*4+2*5
np.dot(f, g)

Output:

14
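
The sketch below covers the 2D cases and the ValueError described above. The arrays m and v are made-up examples.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
v = np.array([10, 20])

print(np.dot(m, v))        # [ 50 110] -> 2D by 1D gives a 1D result
print(np.dot(m, m))
# [[ 7 10]
#  [15 22]]

# Mismatched inner dimensions raise ValueError
try:
    np.dot(m, np.array([1, 2, 3]))
except ValueError as err:
    print("dot failed:", err)
```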

Matrix Multiplication in Python

The NumPy matmul() function returns the matrix product of two arrays. Here is how it works.

1) For 2D arrays, it returns the normal matrix product.

2) For dimensions greater than 2, the product is treated as a stack of matrices.

3) A 1D array is first promoted to a matrix, then the product is calculated.

The infix @ operator, available since Python 3.5, is shorthand for matmul and is widely used in modern code.

Syntax:

numpy.matmul(x, y, out=None)

Here,

x, y: input arrays. Scalars are not allowed.

out: optional ndarray into which the result is written. If it is omitted, a new ndarray is returned.

Example:

In the same way, you can compute matrix multiplication with np.matmul.

### Matmul: matrix product of two arrays
h = [[1,2],[3,4]]
i = [[5,6],[7,8]]
### 1*5+2*7 = 19
np.matmul(h, i)

Output:

array([[19, 22],
       [43, 50]])
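
The three rules listed above, and the @ shorthand, can be sketched as follows. The variable stack is illustrative, and np.stack is used here only to build a batch of matrices.

```python
import numpy as np

h = np.array([[1, 2], [3, 4]])
i = np.array([[5, 6], [7, 8]])

# The @ operator and np.matmul give the same result
print(np.array_equal(h @ i, np.matmul(h, i)))   # True

# A stack of three 2x2 matrices times one 2x2 matrix
stack = np.stack([h, h, h])        # shape (3, 2, 2)
print((stack @ i).shape)           # (3, 2, 2)

# A 1D operand is treated as a vector; the result is 1D again
print(h @ np.array([1, 1]))        # [3 7]
```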

Determinant

If you need to compute the determinant of a matrix, use np.linalg.det(). The input must be a square matrix, and the result is a floating-point value, so expect small rounding errors.

Example:

## Determinant 2*2 matrix
### 5*8-7*6
np.linalg.det(i)

Output:

-2.000000000000005
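
Because the result carries floating-point rounding, it is safer to compare determinants with a tolerance than with ==. A minimal sketch, using np.isclose and a made-up singular matrix:

```python
import numpy as np

i = np.array([[5, 6], [7, 8]])
det = np.linalg.det(i)
print(np.isclose(det, -2.0))       # True; compare floats with a tolerance

singular = np.array([[1, 2], [2, 4]])   # second row is twice the first
print(np.isclose(np.linalg.det(singular), 0.0))   # True
```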

NumPy Broadcasting Explained

Broadcasting is the rule that lets NumPy perform arithmetic on arrays of different shapes without copying data. When you add a scalar to a 1D vector, or a 1D vector to a 2D matrix, NumPy stretches the smaller operand along the missing axes so the shapes line up. This avoids explicit Python loops and is one of the main reasons NumPy code feels fast and concise.

Two shapes are compatible for broadcasting when, compared from right to left, each pair of dimensions is either equal or one of them equals 1. For instance, a matrix of shape (3, 4) broadcasts against a row of shape (4,) or a column of shape (3, 1). The same logic powers normalisation in machine learning, where you subtract a per-feature mean vector from a batch of samples in a single line. Broadcasting is also the conceptual model used by tensor libraries such as PyTorch, TensorFlow, and JAX, so the rules you learn here transfer directly to GPU code.
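
The shapes mentioned above can be tried directly. The sketch below uses a made-up (3, 4) array named batch to show per-feature mean subtraction, a (3, 1) column operand, and an incompatible shape.

```python
import numpy as np

batch = np.array([[1.0, 2.0, 3.0, 4.0],
                  [5.0, 6.0, 7.0, 8.0],
                  [9.0, 10.0, 11.0, 12.0]])      # shape (3, 4)

feature_mean = batch.mean(axis=0)                # shape (4,)
centered = batch - feature_mean                  # (3, 4) - (4,) -> (3, 4)
print(centered.mean(axis=0))                     # [0. 0. 0. 0.]

row_offset = np.array([[10.0], [20.0], [30.0]])  # shape (3, 1)
print((batch + row_offset).shape)                # (3, 4)

# Shapes (3, 4) and (3,) are not compatible: 4 vs 3 on the last axis
try:
    batch + np.array([1.0, 2.0, 3.0])
except ValueError as err:
    print("broadcast failed:", err)
```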

How NumPy Powers AI and Machine Learning

NumPy sits underneath almost every modern Python AI stack. PyTorch tensors, TensorFlow eager tensors, and JAX arrays expose broadcasting, slicing, and reshaping semantics closely modelled on the NumPy ndarray. Many tutorials still feed pre-processed NumPy arrays into model training loops, and the frameworks convert them to their own tensor types with little overhead.

Three concrete patterns show how NumPy connects to AI work today. First, data engineers shape raw features into NumPy arrays before handing them to scikit-learn pipelines or to Hugging Face datasets. Second, researchers use NumPy as a CPU baseline and then swap in CuPy, which provides a near-identical API but runs on NVIDIA GPUs through CUDA. Third, JAX builds on the NumPy API to add automatic differentiation and just-in-time compilation, so the same code that runs on the laptop can run on TPUs in the cloud. The takeaway is that the array skills you build with NumPy carry into every major deep learning toolkit, which makes this tutorial a useful step on the road to AI development.

FAQs

NumPy provides fast N-dimensional arrays, vectorized math, linear algebra, random sampling, and Fourier transforms. Engineers and data scientists use it for numerical computing, image and signal processing, statistics, and as the array backbone for higher-level libraries such as pandas, scikit-learn, and SciPy.

A NumPy array stores elements of one fixed data type in a contiguous block of memory, which makes vectorized math fast. Python lists hold arbitrary objects, support mixed types, and rely on slower per-element loops, so they are flexible but far less efficient for large numerical workloads.

NumPy 2.0, released in June 2024, keeps the core array API stable but removes legacy aliases such as np.float_ and np.NaN, and tightens type promotion rules. Most production code needs only minor edits. Pin numpy<2 if a dependency is not yet updated.

PyTorch and TensorFlow model their tensor objects on NumPy ndarrays. They share the same shape, broadcasting, and slicing semantics, and both accept NumPy arrays as input. Skills learned with NumPy transfer directly to building neural networks, training loops, and tensor pre-processing pipelines.

Stock NumPy runs on the CPU. For GPU acceleration, use CuPy, which mirrors the NumPy API on NVIDIA CUDA devices, or JAX, which adds autodiff and just-in-time compilation for GPUs and TPUs. Both let you reuse most existing NumPy code with minimal rewrites for AI workloads.

Anaconda ships NumPy by default. Otherwise, run pip install numpy or conda install -c anaconda numpy. To upgrade an existing install to NumPy 2.x, run pip install --upgrade numpy. Check the active version with print(np.__version__) inside Python.

Broadcasting lets NumPy operate on arrays of different shapes without copying data. The smaller array is virtually stretched to match the larger one when the dimensions are compatible. It powers concise expressions for normalisation, scaling, and feature engineering in machine learning pipelines.
