Python Language
Data Science in Python:
Data science in Python means using the language's libraries and tools to analyze, manipulate, visualize, and interpret data.
Python has become one of the most popular languages for data science because of its simple syntax, its versatility, and its large ecosystem of libraries for data manipulation, analysis, and visualization. Let's walk through a simple coding example.
We'll use the popular libraries 'NumPy', 'Pandas', and 'Matplotlib' for basic data analysis and visualization, plus 'scikit-learn' for a simple linear regression.
    # Import necessary libraries
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Set the style to dark background with white text
    plt.style.use(
        {
            "figure.facecolor": "#222",  # Custom background color
            "axes.facecolor": "#222",  # Custom background color for axes
            "axes.labelcolor": "white",  # Text color for labels
            "xtick.color": "white",  # Text color for x-axis ticks
            "ytick.color": "white",  # Text color for y-axis ticks
        }
    )

    # Generate some random data
    np.random.seed(0)
    num_samples = 100
    x = np.random.rand(num_samples) * 10
    y = 2.5 * x + np.random.randn(num_samples) * 2.5

    # Create a Pandas DataFrame
    data = pd.DataFrame({"X": x, "Y": y})

    # Display the first few rows of the DataFrame
    print("First few rows of the data:")
    print(data.head())

    # Summary statistics
    print("\nSummary statistics:")
    print(data.describe())

    # Scatter plot of the data
    plt.figure(figsize=(8, 6))
    plt.scatter(data["X"], data["Y"], color="blue", label="Data Points")
    plt.xlabel("X", color="white")  # Set text color to white
    plt.ylabel("Y", color="white")  # Set text color to white
    plt.title("Scatter Plot of X vs Y", color="white")
    plt.legend()
    plt.grid(True)
    plt.show()

    # Correlation between X and Y
    correlation = data["X"].corr(data["Y"])
    print("\nCorrelation between X and Y:", correlation)

    # Simple linear regression
    from sklearn.linear_model import LinearRegression

    # Prepare data for modeling
    X = data[["X"]]
    Y = data["Y"]

    # Create and fit the model
    model = LinearRegression()
    model.fit(X, Y)

    # Print the coefficients
    print("\nLinear Regression Coefficients:")
    print("Intercept:", model.intercept_)
    print("Coefficient:", model.coef_[0])

    # Predictions
    predictions = model.predict(X)

    # Plot the regression line
    plt.figure(figsize=(8, 6))
    plt.scatter(data["X"], data["Y"], color="blue", label="Data Points")
    plt.plot(data["X"], predictions, color="red", label="Regression Line")
    plt.xlabel("X", color="white")  # Set text color to white
    plt.ylabel("Y", color="white")  # Set text color to white
    plt.title("Linear Regression: X vs Y", color="white")
    plt.legend()
    plt.grid(True)
    plt.show()
Output:

    First few rows of the data:
              X          Y
    0  5.488135  10.807463
    1  7.151894  20.131800
    2  6.027634  16.233241
    3  5.448832   9.781470
    4  4.236548  14.312000

    Summary statistics:
                    X           Y
    count  100.000000  100.000000
    mean     4.727938   12.300682
    std      2.897540    7.620958
    min      0.046955   -1.513415
    25%      2.058032    5.382785
    50%      4.674810   12.488958
    75%      6.844833   17.936766
    max      9.883738   29.173335

    Correlation between X and Y: 0.9445225692562866

    Linear Regression Coefficients:
    Intercept: 0.5553776936180661
    Coefficient: 2.4842337553505103
• Explanation:
1. Import 'NumPy' for numerical operations, 'Pandas' for data manipulation, and 'Matplotlib' for data visualization.
2. Generate 100 random data points with a noisy linear relationship: X is drawn uniformly from 0 to 10, and Y is 2.5 times X plus Gaussian noise with a standard deviation of 2.5.
3. Create a Pandas DataFrame to organize our data.
4. Display the first few rows of the DataFrame and its summary statistics.
5. Create a scatter plot to visualize the relationship between X and Y.
6. Calculate the correlation coefficient between X and Y.
7. Perform simple linear regression using the LinearRegression model from scikit-learn.
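8. Print the intercept and slope of the fitted model, then use the model to make predictions. The fitted slope of about 2.48 is close to the 2.5 used to generate the data.
9. Finally, plot the regression line along with the original data points to visualize the linear relationship.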
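To make steps 6 to 8 less of a black box, here is a minimal sketch that recomputes the correlation and the regression line with NumPy alone. It relies on a standard identity for regression with a single predictor: the least-squares slope equals the correlation multiplied by the ratio of the standard deviations of Y and X. The variable names and the use of np.polyfit as a cross-check are illustrative choices, not part of the original example.

```python
import numpy as np

# Recreate the tutorial's synthetic data (same seed, same formula).
np.random.seed(0)
num_samples = 100
x = np.random.rand(num_samples) * 10
y = 2.5 * x + np.random.randn(num_samples) * 2.5

# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

# With one predictor, the least-squares slope is r * (std of y / std of x),
# and the fitted line passes through the point (mean of x, mean of y).
slope = r * y.std(ddof=1) / x.std(ddof=1)
intercept = y.mean() - slope * x.mean()

# Cross-check against NumPy's own degree-1 polynomial fit,
# which returns the coefficients highest degree first.
fit_slope, fit_intercept = np.polyfit(x, y, deg=1)

print("Correlation:", r)
print("Slope:", slope, "Intercept:", intercept)
print("polyfit slope:", fit_slope, "polyfit intercept:", fit_intercept)
```

Both approaches should agree with each other, and with the scikit-learn coefficients above, up to floating-point rounding.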
Note: This example covers only a small part of data science in Python, which encompasses a wide range of tasks, including data manipulation, analysis, visualization, and modeling.
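The example above spends little time on data manipulation, so here is a short sketch of three common Pandas operations on the same synthetic data: adding a derived column, filtering rows, and grouping. The "Noise" and "X_bin" columns, the threshold of 5, and the bin edges are assumptions chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Recreate the tutorial's synthetic data (same seed, same formula).
np.random.seed(0)
x = np.random.rand(100) * 10
y = 2.5 * x + np.random.randn(100) * 2.5
data = pd.DataFrame({"X": x, "Y": y})

# Derived column: distance of each point from the noise-free line y = 2.5 * x.
data["Noise"] = data["Y"] - 2.5 * data["X"]

# Filtering: keep only the rows where X is greater than 5.
upper_half = data[data["X"] > 5]

# Grouping: bin X into five equal-width intervals and average Y per bin.
data["X_bin"] = pd.cut(data["X"], bins=[0, 2, 4, 6, 8, 10])
mean_y_per_bin = data.groupby("X_bin", observed=True)["Y"].mean()

print(upper_half.head())
print(mean_y_per_bin)
```

Because Y grows with X, the per-bin averages of Y should tend to rise from the lowest bin to the highest.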