Series Variable Matches Properties

Check if a Series variable matches specified properties like type, length, uniqueness, and more.

Location of the snippet: python/pandas/series/series_variable_matches_properties

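This snippet checks whether a Series variable matches the specified properties: data type, length, number of unique values, number of null values, and minimum and maximum values. Only series_variable_name is required; every other property is checked only when it is supplied.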

Device Type
Jupyter

Variables:

| Variable Name | Variable Description | Type | Required? | Default |
|---|---|---|---|---|
| series_variable_name | Name of the student's Series variable. | str | Yes | |
| data_type | Expected data type of the Series. | str | No | |
| length | Expected length of the Series. | int | No | |
| num_unique | Expected number of unique values in the Series. | int | No | |
| null_count | Expected number of null values in the Series. | int | No | |
| max_value | Expected maximum value in the Series. | number | No | |
| min_value | Expected minimum value in the Series. | number | No | |
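
To make the variables above concrete, here is a minimal sketch of how each property could be measured with plain pandas. It is not the snippet's actual implementation: the function name check_series_properties is hypothetical, and the rule that data_type matches as a prefix of the dtype name (so 'int' accepts 'int64') is an assumption made for illustration.

```python
import pandas as pd


def check_series_properties(series, data_type=None, length=None, num_unique=None,
                            null_count=None, max_value=None, min_value=None):
    """Return a list of mismatch messages; an empty list means all supplied checks passed."""
    if not isinstance(series, pd.Series):
        return [f"expected a pandas Series, got {type(series).__name__}"]

    problems = []
    # Assumption: data_type matches as a prefix of the dtype name, so 'int' accepts 'int64'.
    if data_type is not None and not str(series.dtype).startswith(data_type):
        problems.append(f"data_type: expected {data_type}, got {series.dtype}")

    # Each remaining property is measured lazily, only when an expected value was given.
    measured = {
        "length": (length, lambda: len(series)),
        "num_unique": (num_unique, lambda: series.nunique()),  # NaN is not counted
        "null_count": (null_count, lambda: int(series.isna().sum())),
        "max_value": (max_value, lambda: series.max()),
        "min_value": (min_value, lambda: series.min()),
    }
    for name, (expected, measure) in measured.items():
        if expected is None:
            continue
        actual = measure()
        if actual != expected:
            problems.append(f"{name}: expected {expected}, got {actual}")
    return problems


ages = pd.Series([18, 30, 30, 65])
print(check_series_properties(ages, data_type="int", length=4, num_unique=3,
                              null_count=0, min_value=18, max_value=65))  # prints []
```

Returning a list of messages rather than a single boolean is a design choice of this sketch; it lets a student see every failing property at once.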

Examples:

1. Checking Data Type and Length of a Series

This example verifies that a student's Series has the correct data type and the expected number of elements.

Scenario: Students are processing user IDs and need to ensure that the resulting user_ids_series has string (object) dtype and contains 50 IDs.

Task: Create a Pandas Series named user_ids_series containing 50 unique string IDs (e.g., 'user_001', 'user_002', ..., 'user_050').

Placeholder:

user_ids_series = ...

Solution:

import pandas as pd
user_ids_series = pd.Series([f'user_{i:03d}' for i in range(1, 51)])

Snippet for the assertion:

| Variable Name | Value |
|---|---|
| series_variable_name | user_ids_series |
| data_type | object |
| length | 50 |
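
Before submitting, a student can read the same two properties directly in the notebook. The sketch below only inspects the Series; it does not reproduce how the snippet compares data_type. Depending on the pandas version, strings may be stored with a dedicated string dtype rather than object, so printing the dtype name is a quick sanity check.

```python
import pandas as pd

user_ids_series = pd.Series([f"user_{i:03d}" for i in range(1, 51)])

# The two properties named in the assertion table, read with plain pandas
dtype_name = str(user_ids_series.dtype)
n_elements = len(user_ids_series)

print(dtype_name, n_elements)
print(user_ids_series.head(3).tolist())  # first three IDs
```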

2. Checking Number of Unique Values and Null Count of a Series

This example checks that a Series has no missing data and contains the expected number of unique values.

Scenario: Students generate a product_categories Series after cleaning, which should have no null values and exactly 3 unique categories.

Task: Create a Pandas Series named product_categories with three unique string values and no nulls (e.g., 'Electronics', 'Apparel', 'Books').

Placeholder:

product_categories = ...

Solution:

import pandas as pd
product_categories = pd.Series(['Electronics', 'Apparel', 'Books'])

Snippet for the assertion:

| Variable Name | Value |
|---|---|
| series_variable_name | product_categories |
| num_unique | 3 |
| null_count | 0 |
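
The two counts can be reproduced with Series.nunique() and Series.isna(). The sketch below also includes a counter-example Series, with_gap, which is purely illustrative and not part of the task; it shows that nunique() ignores missing values by default, so the two properties are measured independently. Whether the snippet counts unique values the same way is an assumption.

```python
import pandas as pd

product_categories = pd.Series(["Electronics", "Apparel", "Books"])

unique_count = product_categories.nunique()            # missing values are excluded by default
null_total = int(product_categories.isna().sum())

# Illustrative counter-example (not part of the task): one entry is missing
with_gap = pd.Series(["Electronics", None, "Books"])
gap_unique = with_gap.nunique()
gap_nulls = int(with_gap.isna().sum())

print(unique_count, null_total)  # prints 3 0
print(gap_unique, gap_nulls)     # prints 2 1
```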

3. Checking Value Range (Min/Max) of a Series

This example validates that numerical values within a Series fall within an expected range, useful for validating scores, measurements, or calculated metrics.

Scenario: After calculating a normalized_scores Series, all values should be between 0.0 and 1.0.

Task: Create a Pandas Series named normalized_scores with float values between 0.0 and 1.0 (inclusive).

Placeholder:

normalized_scores = ...

Solution:

import pandas as pd
normalized_scores = pd.Series([0.1, 0.5, 0.9, 0.0, 1.0])

Snippet for the assertion:

| Variable Name | Value |
|---|---|
| series_variable_name | normalized_scores |
| min_value | 0.0 |
| max_value | 1.0 |
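
As an illustration of how a student might arrive at such a Series, the sketch below applies min-max scaling to hypothetical raw scores (the raw_scores values are made up for this example). With this scaling the smallest input maps to exactly 0.0 and the largest to exactly 1.0, so the result satisfies the table above whether min_value and max_value are compared exactly or treated as bounds.

```python
import pandas as pd

# Hypothetical raw scores, used only for illustration
raw_scores = pd.Series([12.0, 30.0, 48.0, 21.0])

# Min-max scaling: subtract the minimum, then divide by the value range
span = raw_scores.max() - raw_scores.min()
normalized_scores = (raw_scores - raw_scores.min()) / span

observed_min = normalized_scores.min()
observed_max = normalized_scores.max()
print(observed_min, observed_max)  # prints 0.0 1.0
```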

4. Combined Property Checks for a Series (Type, Length, Null Count, Min/Max)

This example combines several property checks in a single assertion, so the Series must meet all of the criteria at once.

Scenario: A processed_ages Series, derived from raw data, is expected to be integer type, have a specific length, contain no nulls, and have ages within a realistic range.

Task: Create a Pandas Series named processed_ages with 100 integer entries, no nulls, and all ages between 18 and 65.

Example Student's Code (in Project.ipynb):

import pandas as pd
import numpy as np
# Simulate student generating processed_ages
np.random.seed(42)
ages = np.random.randint(18, 66, size=100)
processed_ages = pd.Series(ages)

Snippet for the assertion:

| Variable Name | Value |
|---|---|
| series_variable_name | processed_ages |
| data_type | int |
| length | 100 |
| null_count | 0 |
| min_value | 18 |
| max_value | 65 |
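
The combined properties can be summarized in the notebook before the assertion runs. This is a sketch with plain pandas, not the snippet's own logic. Note that np.random.randint excludes its upper bound, so every value lies in 18..65, but whether a particular random sample actually contains both 18 and 65 depends on the draw; printing the observed minimum and maximum shows what the min_value and max_value checks will see.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
processed_ages = pd.Series(np.random.randint(18, 66, size=100))  # upper bound 66 is exclusive

summary = {
    "is_integer": pd.api.types.is_integer_dtype(processed_ages),
    "length": len(processed_ages),
    "null_count": int(processed_ages.isna().sum()),
    "within_range": bool(processed_ages.between(18, 65).all()),
}
print(summary)

# Observed extremes of this particular sample
print(processed_ages.min(), processed_ages.max())
```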