This definitive collection of AI prompts for Data Science has been specifically designed to transform professionals and students into high-performing experts. Through a meticulous structure, this library covers everything from technical data manipulation to strategic communication of findings, allowing you to automate complex workflows and increase the accuracy of your predictive models in record time. By integrating these prompts into your workflow, you will gain an immediate competitive advantage in the job market. Each instruction is optimized to generate clean code, rigorous statistical analysis, and impactful visualizations, ensuring that every stage of your data pipeline meets the most demanding standards in today's technology industry.
100 resources included
Act as a Senior Python Developer and Data Architect specialized in optimizing interactive dashboards with Streamlit. Your mission is to analyze a data analysis application that has latency problems and transform it into a high-performance tool. To do this, apply Streamlit's caching mechanisms expertly, clearly differentiating when to use @st.cache_data for serializable data and @st.cache_resource for global objects that should not be reloaded, such as AI models or database connections to [Data_Source]. Carefully analyze the workflow described in [Code_or_Process] and identify the critical points where execution slows down due to repetitive operations on a dataset of size [Dataset_Size]. Propose a code restructure that uses the cache decorators with specific parameters: 'ttl' set to [TTL_Seconds] to ensure data freshness, 'max_entries' for memory control, and 'show_spinner' to improve the end user's perception of speed in the dashboard. Provide a complete technical solution that includes: 1. An optimized data loading function that handles exceptions and cleans data before caching. 2. A heavy processing function that uses vectorization techniques where possible. 3. The integration of these elements into a Streamlit layout that uses containers and columns for a professional presentation. Be sure to explain the logic behind each caching decision, especially why one method was chosen over the other based on the nature of the objects [Object_Type]. Finally, generate a production-ready code block that serves as a high-performance template for Data Science projects. The code must include detailed comments, follow the PEP 8 style guide, and demonstrate how cache invalidation can be handled programmatically so that the user never sees stale information. The goal is for interaction with dashboard filters to feel instantaneous, regardless of the complexity of the underlying calculations.
Act as a Senior Data Scientist with extensive experience in Marketing Analytics and unsupervised learning models. The main objective of this task is to design and implement an advanced customer segmentation system for the [Industry or Business] sector, focused exclusively on transactional behavior patterns and digital interaction. To begin, the analysis must include rigorous preprocessing of the raw data hosted in [Data Source]. It is imperative to perform an Exploratory Data Analysis (EDA) that identifies the correlation between variables such as frequency of platform use, total accumulated spend, time spent per session, and [Additional Variable 1]. Propose feature engineering techniques that transform temporal data into actionable variables such as 'potential churn rate' or 'purchase seasonality'. Next, develop a technical comparison of the K-Means, DBSCAN, and Hierarchical Clustering algorithms. For K-Means, include Elbow Method and Silhouette Score visualizations to validate the optimal number of clusters (k). For DBSCAN, justify the choice of the 'epsilon' and 'min_samples' parameters based on the density of the data and the presence of noise or outliers that must be excluded from the main segments. Once the groups are defined, generate an exhaustive description of each identified segment. Don't limit yourself to statistics; translate the data into 'Customer Archetypes'. For example, define groups such as 'Champions', 'At-Risk Customers', or 'New Enthusiasts'. For each segment, detail: 1) Key behavior, 2) Projected economic value, 3) Recommended retention strategy, and 4) Preferred communication channels based on their previous interactions.
Finally, deliver the complete, documented code in Python, using libraries such as Scikit-Learn for modeling, Pandas for data manipulation, and Matplotlib/Seaborn or Plotly for the spatial visualization of the clusters. The code should include a 'Feature Importance' section to show which variables had the most influence on the separation of each group.
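The algorithm comparison requested above might start from something like the following sketch. It runs on synthetic blobs from `make_blobs` in place of real customer features, and the DBSCAN values `eps=0.3` and `min_samples=5` are placeholders chosen for illustration; real data would need them tuned and justified as the prompt requires.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for scaled customer features (an assumption of this sketch).
X_raw, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)
X = StandardScaler().fit_transform(X_raw)

# K-Means: collect inertia (for the elbow plot) and silhouette score for each k.
kmeans_scores = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    kmeans_scores[k] = (model.inertia_, silhouette_score(X, model.labels_))

# Pick the k with the highest silhouette score.
best_k = max(kmeans_scores, key=lambda k: kmeans_scores[k][1])

# Hierarchical clustering with the same number of clusters, for comparison.
agg_labels = AgglomerativeClustering(n_clusters=best_k).fit_predict(X)
agg_silhouette = silhouette_score(X, agg_labels)

# DBSCAN: points labelled -1 are treated as noise, not as a segment.
db_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)
n_db_clusters = len(set(db_labels) - {-1})
noise_share = float(np.mean(db_labels == -1))

print(f"K-Means best k by silhouette: {best_k}")
print(f"Agglomerative silhouette at k={best_k}: {agg_silhouette:.3f}")
print(f"DBSCAN clusters: {n_db_clusters}, noise share: {noise_share:.1%}")
```

The inertia values stored in `kmeans_scores` are what an Elbow Method plot would draw, and the silhouette values are what a Silhouette Score plot would draw; the plotting itself is left out to keep the sketch short.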
Act as an expert consultant in Data Visualization and UI/UX Design specialized in digital accessibility and WCAG standards. Your objective is to design a custom color architecture for a technical report for [Project or Industry Name] using exclusively the Python Matplotlib library. The palette should be aesthetically sophisticated and professional, but its absolute priority is inclusivity: it must meet AA or AAA contrast levels and remain perfectly readable for users with color vision deficiencies such as Protanopia, Deuteranopia, and Tritanopia. Generate a robust and modular Python script that defines a color sequence in HEX format. The scheme should be composed of a high-impact primary color to highlight critical findings ([Suggested Primary Color]), a set of [Number of Categories] balanced secondary colors for multivariate comparisons, and a range of technical grays for supporting elements such as axes, labels, and grids. Use `matplotlib.colors.ListedColormap` to register this palette in the Matplotlib system, allowing its global use through `plt.rcParams`. The script must include a visual validation function that generates a test dashboard with three subplots: 1) A clustered bar chart demonstrating visual separation between adjacent categories; 2) A line chart with distinct markers so that information is encoded redundantly, beyond hue alone; and 3) A heatmap that validates the perceptual uniformity of the lightness scale. Each visualization should apply minimalist design principles, eliminating visual noise or 'chartjunk'. Finally, provide a detailed technical justification for each color choice, explaining its contrast ratio against the chosen background [Background Color: White/Light Grey/Dark].
The code must be ready to integrate into a professional Data Science workflow within [Name of Repository or Work Environment], including comprehensive comments explaining how changes in saturation and value improve readability under conditions of eye strain or black-and-white printing.
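The contrast justification requested in this prompt rests on the WCAG contrast-ratio formula, which can be checked without Matplotlib. The sketch below uses only the standard library. The sample palette is the widely used Okabe-Ito colorblind-safe set, swapped in purely as an example rather than a palette the prompt prescribes, and the helper names are hypothetical.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG definition."""
    c = channel / 255.0
    # WCAG 2.0/2.1 print 0.03928 here, the sRGB standard uses 0.04045;
    # no 8-bit value falls between the two, so results are identical.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' color (0.0 = black, 1.0 = white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def wcag_level(foreground: str, background: str) -> str:
    """Classify a color pair against the WCAG contrast thresholds."""
    ratio = contrast_ratio(foreground, background)
    if ratio >= 7.0:
        return "AAA"
    if ratio >= 4.5:
        return "AA"
    if ratio >= 3.0:
        return "AA (large text / graphical objects only)"
    return "fail"


# Example palette: Okabe-Ito colors (an assumption of this sketch).
PALETTE = {
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "bluish green": "#009E73",
    "orange": "#E69F00",
}
BACKGROUND = "#FFFFFF"  # stand-in for [Background Color]

for name, hex_code in PALETTE.items():
    ratio = contrast_ratio(hex_code, BACKGROUND)
    print(f"{name:>13} {hex_code}: {ratio:.2f}:1 -> {wcag_level(hex_code, BACKGROUND)}")
```

A check like this only covers luminance contrast against the background. Separation between categories under Protanopia, Deuteranopia, or Tritanopia still needs the visual validation dashboard the prompt describes.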