Model Compression vs. Fine-tuning
Model Compression Techniques

Model compression techniques are strategies used to reduce the size, latency, and computational requirements of machine learning models, especially deep learning models, while preserving accuracy. These techniques are crucial for deploying models on edge devices, mobile phones, or in production environments with strict performance constraints.

Common Model Compression Techniques

1. Pruning

Removes unnecessary weights or neurons from the model (see the pruning sketch after this outline):

- Weight pruning: set small-magnitude weights to zero.
- Structured pruning: remove entire filters, channels, or layers for better hardware efficiency.

2. Quantization

Reduces the precision of weights and activations (see the quantization sketch below):

- Post-training quantization: convert a trained model's parameters to lower precision (e.g., from float32 to int8).
- Quantization-aware training (QAT): simulate quantization during training for higher accuracy.

3. Knowledge Distillation

A smaller "student" model is trained to mimic the outputs of a larger "teacher" model, transferring most of the teacher's accuracy into a much cheaper architecture (see the distillation sketch below).
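To make pruning concrete, here is a minimal sketch using PyTorch's built-in torch.nn.utils.prune utilities. The layer sizes and the 50% sparsity level are arbitrary illustrative choices, not recommendations:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model; real pruning targets a trained network.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Weight pruning: zero out the 50% of weights with the smallest L1 magnitude.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor

sparsity = (model[0].weight == 0).float().mean().item()
print(f"Layer 0 sparsity: {sparsity:.0%}")
```

For structured pruning, the same module offers prune.ln_structured, which removes whole rows or channels (dim=0 for output neurons) rather than scattered individual weights, which maps better to real hardware speedups.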
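The following sketch shows one flavor of post-training quantization: PyTorch's dynamic quantization, which converts Linear-layer weights from float32 to int8 after training with no retraining required. The model and the temp-file size check are illustrative scaffolding:

```python
import os
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()  # post-training quantization operates on an inference-mode model

# Weights become int8; activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Rough on-disk size of a model's parameters, for comparison only."""
    torch.save(m.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return size

print(f"float32: {size_mb(model):.2f} MB, int8: {size_mb(quantized):.2f} MB")
```

QAT differs in that fake-quantization ops are inserted before training so the network learns weights that survive the precision loss; the conversion step then happens after training as above.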
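Finally, a minimal knowledge-distillation sketch in PyTorch, showing one training step in the classic soft-target formulation: the student matches the teacher's temperature-softened output distribution while also fitting the true labels. The layer sizes, temperature T=4, and mixing weight alpha=0.5 are assumed tuning choices for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(784, 1024), nn.ReLU(), nn.Linear(1024, 10))
student = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

T, alpha = 4.0, 0.5  # temperature and loss-mixing weight (illustrative values)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(32, 784)              # stand-in batch; real code uses a DataLoader
labels = torch.randint(0, 10, (32,))

with torch.no_grad():
    teacher_logits = teacher(x)       # the teacher is frozen during distillation
student_logits = student(x)

# Soft loss: KL divergence between softened teacher and student distributions,
# scaled by T^2 to keep gradient magnitudes comparable across temperatures.
soft = F.kl_div(
    F.log_softmax(student_logits / T, dim=1),
    F.softmax(teacher_logits / T, dim=1),
    reduction="batchmean",
) * (T * T)
hard = F.cross_entropy(student_logits, labels)  # standard supervised loss
loss = alpha * soft + (1 - alpha) * hard

opt.zero_grad()
loss.backward()
opt.step()
```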