Posts

Showing posts from April, 2025

Model Development Process

Four Phases of ML Model Development

The four phases of ML model development below follow the standard flow of most real-world machine learning projects.

1. Problem Definition & Data Collection
Goal: Understand the business or research problem and gather the right data.
Key Activities:
• Define the objective (classification, regression, recommendation, etc.)
• Identify key metrics (accuracy, RMSE, precision, etc.)
• Collect or acquire data from relevant sources
• Understand data privacy, licensing, and ethics considerations
Output: Well-defined problem statement, raw datasets, and clear goals.

🧹 2. Data Preparation & Exploration
Goal: Clean, explore, and understand your data to prepare it for modeling.
Key Activities:
• Handle missing values, outliers, and duplicates
• Normalize, encode, or transform features
• Feature engineering and selection
• Exploratory Data Analysis (EDA): understand ...
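As a rough illustration of the data preparation activities listed above (duplicates, missing values, normalization), here is a minimal standard-library sketch. The function name `prepare`, mean imputation, and min-max scaling are assumptions chosen for illustration; the post does not prescribe specific techniques, and a real project would more likely use pandas or scikit-learn.

```python
from statistics import mean


def prepare(rows):
    """Drop duplicate rows, fill missing values (None) with the column
    mean, then min-max scale every column to the range [0, 1].

    Assumes every row has the same length and every column has at
    least one non-missing numeric value.
    """
    # Remove exact duplicates while keeping first-seen order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    if not unique:
        return []

    for col in range(len(unique[0])):
        # Mean imputation: replace None with the mean of observed values.
        fill = mean(r[col] for r in unique if r[col] is not None)
        for r in unique:
            if r[col] is None:
                r[col] = fill

        # Min-max scaling; a constant column maps to 0.0.
        lo = min(r[col] for r in unique)
        hi = max(r[col] for r in unique)
        span = hi - lo
        for r in unique:
            r[col] = (r[col] - lo) / span if span else 0.0
    return unique


raw = [(1.0, 10.0), (1.0, 10.0), (3.0, None), (5.0, 30.0)]
clean = prepare(raw)
```

Outlier handling, categorical encoding, and feature selection are left out to keep the sketch short.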

Experiment Tracking and Versioning during Model Training

Experiment Tracking

The following is a short list of things you might want to track for each experiment during training:
• The loss curve corresponding to the train split and each of the eval splits.
• The model performance metrics that you care about on all nontest splits, such as accuracy, F1, and perplexity.
• The log of corresponding samples, predictions, and ground truth labels. This comes in handy for ad hoc analytics and sanity checks.
• The speed of your model, evaluated by the number of steps per second or, if your data is text, the number of tokens processed per second.
• System performance metrics such as memory usage and CPU/GPU utilization. They're important to identify bottlenecks and avoid wasting system resources.
• The values over time of any parameter and hyperparameter whose changes can affect your model's performance, such as the learning rate if you use...
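To make the list above concrete, here is a minimal sketch of a tracker that appends one JSON record per logged step to a file. The class `ExperimentTracker`, the helper `load_records`, and the record fields are all hypothetical names invented for this example; dedicated experiment-tracking tools cover far more (system metrics, artifacts, dashboards).

```python
import json
import time


class ExperimentTracker:
    """Hypothetical tracker: writes one JSON line per logged step."""

    def __init__(self, path, hyperparams):
        self.path = path
        self.start = time.perf_counter()
        # First line records the fixed configuration of the run.
        with open(path, "w", encoding="utf-8") as f:
            f.write(json.dumps({"type": "config", "hyperparams": hyperparams}) + "\n")

    def log(self, step, split, loss, learning_rate, **metrics):
        """Record loss, current learning rate, speed, and any extra metrics."""
        elapsed = time.perf_counter() - self.start
        record = {
            "type": "step",
            "step": step,
            "split": split,  # e.g. "train" or "eval"
            "loss": loss,
            "learning_rate": learning_rate,  # may change over time
            "steps_per_sec": step / elapsed if elapsed > 0 else None,
            "metrics": metrics,  # e.g. accuracy, F1, perplexity
        }
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")


def load_records(path):
    """Read every record back for ad hoc analysis."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

A training loop would call `tracker.log(step, "train", loss, lr, accuracy=acc)` after each step (or every N steps) and then plot the loss curve per split from `load_records(path)`.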

Pickle Vs M, VS ONNX vs SavedModel vs TorchScript

Let's break it down in terms of purpose, use cases, compatibility, and safety:

🥒 Pickle (.pkl)
Pickle is a Python-specific serialization format for objects, including machine learning models.

| Feature | Details |
| --- | --- |
| Use Case | Serializing Python objects (including scikit-learn, XGBoost models) |
| Compatibility | Python-only (tight coupling with specific versions) |
| Frameworks | scikit-learn, XGBoost, LightGBM, etc. |
| Speed | Fast to load/save |
| Portability | ❌ Low: not portable across languages or platforms |
| Security | ⚠️ Unsafe to unpickle untrusted data (can execute arbitrary code) |
| Deployment | Typically for offline inference or Python-based pipelines |

✅ Best For: Local development, internal tools, reproducible experiments
❌ Not Ideal For: Cross-platform deployment, mobile/edge/cloud scaling

🧠 TensorFlow SavedModel (MD)
SavedModel is TensorFlow's official format for storing trained models for production use. ...
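The Pickle security warning above is easier to appreciate with a small demonstration. The sketch below uses only the standard-library `pickle` module; the dict standing in for a trained model and the `Payload` class are hypothetical, and the callable it triggers (`len`) is deliberately harmless.

```python
import pickle

# A plain dict stands in for a trained model (hypothetical values).
model = {"weights": [0.5, 2.0], "bias": 1.0}

# Round trip: serialize to bytes, then restore an equal object.
blob = pickle.dumps(model)
restored = pickle.loads(blob)


class Payload:
    """Illustrative object whose pickle runs a callable on load."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it while
        # loading. len() is harmless here; a crafted pickle could name
        # any importable callable instead.
        return (len, ("untrusted",))


# Loading does not rebuild a Payload; it returns the result of the call.
loaded = pickle.loads(pickle.dumps(Payload()))
```

Because the choice of callable lives inside the pickle bytes, `pickle.loads` should only ever be given data from a source you trust.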

Model Framework

| Framework | Description | Common Use Cases | Supported Formats |
| --- | --- | --- | --- |
| TensorFlow | Open-source ML platform by Google. Supports training & deployment. | Deep learning, production ML at scale | .pb, .h5, SavedModel |
| Keras | High-level API for building and training models (now integrated with TF). | Rapid prototyping, beginner-friendly | .h5, SavedModel |
| PyTorch | Flexible and widely used for research and prototyping (by Meta). | Academic research, dynamic computation | .pt, TorchScript, ONNX |
| ONNX | Open format to represent ML models for interoperability across frameworks. | Cross-platform model deployment | .onnx |
| scikit-learn | Library for classical ML models in Python. | Traditional ML (regression, classification) | Pickle .pkl, ONNX (via converter) |
| XGBoost | Gradient boosting framework optimized for speed/performance. | Structured/tabular data tasks | Binary .model, JSON, ONNX |
| TensorRT | NVIDIA SDK for high-performance deep learning inference. | Model optimization for GPU inference | Supports ONNX, UFF, TF Core... |

Explore the power of GitHub Actions

GitHub Log
Building an AI Article Reviewer using GenAI Platform and GitHub
Create an Azure AI Services Resource using Bicep
Automate your workflow with GitHub Actions
Introduction to GitHub Actions
GitHub Actions Certification Learning Path
Hacktoberfest
Introduction to Open-Source
Implement open-source software
If you enjoyed the episode and want more content like it, try our monthly developer newsletter.
Announcing the Microsoft AI Skills Fest: Save the date!

Hi, I am an Office 365 admin (non-developer). How is GitHub helpful to us?
GitHub is not just a filing cabinet for code; you can use it for many things. I know a lot of people who use it for blogging. It's pretty easy to stand up your own website from GitHub Pages (for free!).

Hi, I am a startup founder at the ideation stage. I don't have a technical background. How can you help me? Technical and product building.
Here is a learning path you might start with: https://learn.microsoft.com/en-u...