Chapter 12 Assignment - Data Analysis

Test your knowledge of data analysis using NumPy, Pandas, DuckDB, and SQL integration. Questions are categorized into three levels: Basic, Intermediate, and Advanced.

Basic Questions

  1. Create a NumPy array of 10 zeros.
  2. Convert a Python list [1,2,3,4,5] to a NumPy array.
  3. Use NumPy to generate an array of the even numbers from 2 to 20, inclusive.
  4. Create a Pandas DataFrame from a dictionary of lists.
  5. Read a CSV file into a Pandas DataFrame.
  6. Display the first 5 rows of a DataFrame using head().
  7. Find the shape of a NumPy array.
  8. Sort a DataFrame by the values in one column.
  9. Filter a DataFrame to the rows where a column's value is greater than 50.
  10. Execute a simple SELECT query using DuckDB or sqlite3.
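
As a starting point for the Basic questions, here is a minimal sketch touching several of them. The sample data, the column names (name, score), and the table name (scores) are made up for illustration, and sqlite3 is used for the SELECT query because it ships with Python. Treat it as one possible approach, not the expected answer.

```python
import sqlite3

import numpy as np
import pandas as pd

# Questions 1-3: basic array creation.
zeros = np.zeros(10)
arr = np.array([1, 2, 3, 4, 5])
evens = np.arange(2, 21, 2)  # stop is exclusive, so 21 keeps 20 in the result

# Question 7: shape is a tuple of dimension sizes.
print(arr.shape)

# Question 4: DataFrame from a dictionary of lists (hypothetical sample data).
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [72, 45, 90]})

# Questions 6, 8, 9: preview, sort, and filter.
print(df.head())
sorted_df = df.sort_values("score")
high = df[df["score"] > 50]

# Question 10: a simple SELECT against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 72), ("b", 45), ("c", 90)],
)
rows = conn.execute(
    "SELECT name, score FROM scores WHERE score > 50 ORDER BY score"
).fetchall()
conn.close()
```

Question 5 (reading a CSV) is left out because it depends on a file of your own; pd.read_csv accepts a path to that file.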

Intermediate Questions

  1. Find the mean, median, and standard deviation of a NumPy array.
  2. Merge two DataFrames using a common column.
  3. Use groupby() in Pandas to get average values by group.
  4. Use DuckDB to run a SQL query on a Pandas DataFrame.
  5. Drop missing values from a DataFrame.
  6. Plot a bar chart using Pandas built-in plotting.
  7. Reshape a NumPy array using reshape().
  8. Use Pandas to read an Excel file.
  9. Apply a lambda function to a column in a DataFrame.
  10. Write data from a DataFrame into a new SQL table using Pandas.
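
The sketch below illustrates a few of the Intermediate questions with NumPy, Pandas, and the standard-library sqlite3 module. The data, the column names (id, team, score, doubled), and the table name (results) are hypothetical. Questions that need extra packages or files (DuckDB, plotting, Excel) are not covered here.

```python
import sqlite3

import numpy as np
import pandas as pd

# Question 1: summary statistics. Note that ndarray.std() defaults to the
# population standard deviation (ddof=0).
values = np.array([2.0, 4.0, 4.0, 6.0])
mean = values.mean()
median = np.median(values)
std = values.std()

# Question 7: reshape six elements into 2 rows by 3 columns.
grid = np.arange(6).reshape(2, 3)

# Question 2: merge on a shared column (hypothetical sample data).
people = pd.DataFrame({"id": [1, 2, 3], "team": ["x", "y", "x"]})
scores = pd.DataFrame({"id": [1, 2, 3], "score": [10.0, None, 30.0]})
merged = people.merge(scores, on="id")

# Question 5: drop rows that contain missing values.
clean = merged.dropna()

# Question 3: average score per team.
team_avg = clean.groupby("team")["score"].mean()

# Question 9: apply a lambda to one column and store the result.
clean = clean.assign(doubled=clean["score"].apply(lambda s: s * 2))

# Question 10: write the DataFrame to a new SQL table, then count its rows.
conn = sqlite3.connect(":memory:")
clean.to_sql("results", conn, index=False)
n_rows = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
conn.close()
```

For a vectorizable operation such as doubling, clean["score"] * 2 gives the same result as the lambda and is usually faster; apply is shown only because the question asks for it.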

Advanced Questions

  1. Use NumPy broadcasting to add two arrays of different shapes.
  2. Implement a sliding window mean using NumPy.
  3. Use Pandas pivot_table for advanced data aggregation.
  4. Run JOIN queries using DuckDB on two CSV files.
  5. Optimize performance by selecting specific columns while reading a large file in Pandas.
  6. Perform a data transformation using the SQLAlchemy ORM.
  7. Build a pipeline that reads a CSV, processes it with Pandas, and writes results to SQL.
  8. Handle time series data with DatetimeIndex in Pandas.
  9. Use NumPy's advanced indexing to modify specific elements.
  10. Create a custom Pandas accessor for reusable data operations.
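
For the Advanced questions, the following sketch shows one way to approach several of them using only NumPy and Pandas. The inline CSV text, the column names, the helper sliding_mean, and the accessor name stats are all invented for illustration. The sliding window mean uses np.convolve, which is one of several valid techniques. The DuckDB and SQLAlchemy questions are not covered.

```python
import io

import numpy as np
import pandas as pd

# Question 1: broadcasting a (3, 1) array against a (4,) array gives (3, 4).
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
table = col + row


# Question 2: sliding window mean via convolution with equal weights.
def sliding_mean(x, window):
    return np.convolve(x, np.ones(window) / window, mode="valid")


smoothed = sliding_mean(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), 3)

# Question 9: integer-array and boolean-mask indexing, both modifying in place.
data = np.arange(10)
data[[1, 3, 5]] = -1
data[data > 7] = 0

# Question 5: read only the needed columns (an in-memory CSV stands in
# for a large file here).
csv_text = io.StringIO(
    "date,region,sales,notes\n"
    "2024-01-01,north,10,a\n"
    "2024-01-02,south,20,b\n"
    "2024-01-03,north,30,c\n"
)
sales = pd.read_csv(
    csv_text, usecols=["date", "region", "sales"], parse_dates=["date"]
)

# Question 3: total sales per region with pivot_table.
by_region = sales.pivot_table(values="sales", index="region", aggfunc="sum")

# Question 8: a DatetimeIndex enables resampling into two-day bins.
ts = sales.set_index("date")["sales"]
two_day = ts.resample("2D").sum()


# Question 10: a custom accessor, registered under the hypothetical name "stats".
@pd.api.extensions.register_dataframe_accessor("stats")
class StatsAccessor:
    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def zscore(self, column):
        s = self._obj[column]
        return (s - s.mean()) / s.std()


z = sales.stats.zscore("sales")
```

Once registered, the accessor is available on every DataFrame in the session, so sales.stats.zscore("sales") works without subclassing DataFrame.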