Series Variable Matches Properties
Check if a Series variable matches specified properties like type, length, uniqueness, and more.
Location of the snippet: python/pandas/series/series_variable_matches_properties
This snippet checks whether a Series variable matches specified properties: data type, length, number of unique values, number of null values, and minimum and maximum values.
| Device Type |
|---|
| Jupyter |
Variables:
| Variable Name | Variable Description | Type | Required? | Default |
|---|---|---|---|---|
| series_variable_name | Name of the student's Series variable. | str | Yes | |
| data_type | Expected data type of the Series. | str | No | |
| length | Expected length of the Series. | int | No | |
| num_unique | Expected number of unique values in the Series. | int | No | |
| null_count | Expected number of null values in the Series. | int | No | |
| max_value | Expected maximum value in the Series. | number | No | |
| min_value | Expected minimum value in the Series. | number | No | |
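The properties above map naturally onto standard pandas calls. The sketch below shows one way such a check could be expressed. `series_matches` is a hypothetical helper written for illustration, not the snippet's actual implementation; the prefix-based dtype comparison and the equality-based min/max comparison are assumptions.

```python
import pandas as pd


def series_matches(series, data_type=None, length=None, num_unique=None,
                   null_count=None, max_value=None, min_value=None):
    """Hypothetical helper: True if every supplied property matches the Series."""
    checks = []
    if data_type is not None:
        # Assumption: a prefix such as 'int' is accepted for a dtype named 'int64'.
        checks.append(str(series.dtype).startswith(data_type))
    if length is not None:
        checks.append(len(series) == length)
    if num_unique is not None:
        # Series.nunique() ignores null values by default.
        checks.append(series.nunique() == num_unique)
    if null_count is not None:
        checks.append(int(series.isna().sum()) == null_count)
    if max_value is not None:
        checks.append(series.max() == max_value)
    if min_value is not None:
        checks.append(series.min() == min_value)
    # Properties that were not supplied are simply skipped.
    return all(checks)


sample = pd.Series([1, 2, 2, None])  # the None forces a float64 dtype
print(series_matches(sample, data_type='float', length=4, null_count=1))
```

Only the properties passed in are checked, which mirrors the optional variables in the table.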
Examples:
1. Checking Data Type and Length of a Series
This example verifies that a student's Series has the correct data type and the expected number of elements.
Scenario: Students are processing user IDs and need to ensure the resulting user_ids_series is of string (object) type and contains 50 IDs.
Task:
Create a Pandas Series named user_ids_series containing 50 unique string IDs (e.g., 'user_001', 'user_002', ..., 'user_050').
Placeholder:
user_ids_series = ...
Solution:
import pandas as pd
user_ids_series = pd.Series([f'user_{i:03d}' for i in range(1, 51)])
Snippet for the assertion:
| Variable Name | Value |
|---|---|
| series_variable_name | user_ids_series |
| data_type | object |
| length | 50 |
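Before configuring the assertion, it can help to inspect the two properties directly. This is a minimal sketch using the solution code from this example; the dtype name pandas reports for string data may differ between pandas versions, so it is worth confirming the value in the target environment.

```python
import pandas as pd

user_ids_series = pd.Series([f'user_{i:03d}' for i in range(1, 51)])

# Inspect the two properties the assertion targets.
print(user_ids_series.dtype)   # dtype name for string data (version dependent)
print(len(user_ids_series))    # number of elements in the Series
```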
2. Checking Number of Unique Values and Null Count of a Series
This example checks that a Series has no missing data and contains the expected number of unique values.
Scenario: Students generate a product_categories Series after cleaning, which should have no null values and exactly 3 unique categories.
Task:
Create a Pandas Series named product_categories with three unique string values and no nulls (e.g., 'Electronics', 'Apparel', 'Books').
Placeholder:
product_categories = ...
Solution:
import pandas as pd
product_categories = pd.Series(['Electronics', 'Apparel', 'Books'])
Snippet for the assertion:
| Variable Name | Value |
|---|---|
| series_variable_name | product_categories |
| num_unique | 3 |
| null_count | 0 |
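The two values in this assertion can be reproduced with plain pandas calls. The sketch below assumes the snippet counts unique values the way `Series.nunique()` does (nulls excluded by default) and counts nulls the way `Series.isna().sum()` does; the snippet's internals are not shown here.

```python
import pandas as pd

product_categories = pd.Series(['Electronics', 'Apparel', 'Books'])

unique_count = product_categories.nunique()            # distinct non-null values
missing_count = int(product_categories.isna().sum())   # number of null entries

print(unique_count, missing_count)  # prints: 3 0
```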
3. Checking Value Range (Min/Max) of a Series
This example checks that the numerical values in a Series fall within an expected range, which is useful for scores, measurements, or calculated metrics.
Scenario: After calculating a normalized_scores Series, all values should be between 0.0 and 1.0.
Task:
Create a Pandas Series named normalized_scores with float values between 0.0 and 1.0 (inclusive).
Placeholder:
normalized_scores = ...
Solution:
import pandas as pd
normalized_scores = pd.Series([0.1, 0.5, 0.9, 0.0, 1.0])
Snippet for the assertion:
| Variable Name | Value |
|---|---|
| series_variable_name | normalized_scores |
| min_value | 0.0 |
| max_value | 1.0 |
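The expected minimum and maximum can be confirmed on the solution Series as follows. This is a sketch only: it assumes min_value and max_value are compared with the Series' actual minimum and maximum, and it adds a separate range check with `Series.between` for illustration.

```python
import pandas as pd

normalized_scores = pd.Series([0.1, 0.5, 0.9, 0.0, 1.0])

lowest = normalized_scores.min()
highest = normalized_scores.max()

# between() is inclusive on both ends by default.
all_in_range = bool(normalized_scores.between(0.0, 1.0).all())

print(lowest, highest, all_in_range)  # prints: 0.0 1.0 True
```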
4. Combined Property Checks for a Series (Type, Length, Null Count, Min/Max)
This comprehensive example combines multiple property checks to ensure a Series meets several criteria simultaneously.
Scenario: A processed_ages Series, derived from raw data, is expected to be integer type, have a specific length, contain no nulls, and have ages within a realistic range.
Task:
Create a Pandas Series named processed_ages with 100 integer entries, no nulls, and all ages between 18 and 65.
Example Student's Code (in Project.ipynb):
import pandas as pd
import numpy as np
# Simulate student generating processed_ages
np.random.seed(42)
ages = np.random.randint(18, 66, size=100)
processed_ages = pd.Series(ages)
Snippet for the assertion:
| Variable Name | Value |
|---|---|
| series_variable_name | processed_ages |
| data_type | int |
| length | 100 |
| null_count | 0 |
| min_value | 18 |
| max_value | 65 |
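When the data is randomly generated, it is worth inspecting every property before fixing the expected values in the assertion. The sketch below reuses the student code from this example. If min_value and max_value are compared for equality (an assumption about the snippet), note that random sampling alone does not guarantee that both 18 and 65 actually occur.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
ages = np.random.randint(18, 66, size=100)  # upper bound 66 is exclusive
processed_ages = pd.Series(ages)

# Collect the properties targeted by the combined assertion.
summary = {
    'dtype': str(processed_ages.dtype),
    'length': len(processed_ages),
    'null_count': int(processed_ages.isna().sum()),
    'min': int(processed_ages.min()),
    'max': int(processed_ages.max()),
}
print(summary)
```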