Tuesday, June 23, 2026

Merged timeline of 267 items: blog publish times and listing timestamps, cut at midnight. Page 1 of 6.

  1. LLM · Mistral AI
    Mistral OCR 4

    Mistral OCR 4 extracts and structures content from documents, featuring bounding boxes, block classification, and inline confidence scores in 170 languages. It excels in multilingual document processing and is designed…

    by Yash @ Explainx · 0 comments
  2. LLM · Baidu Inc.
    Unlimited OCR Works

    Unlimited OCR is designed for one-shot long-horizon parsing of documents. It enhances the capabilities of previous OCR models, enabling efficient document processing.

    by Yash @ Explainx · 0 comments
  3. Skill · design
    user-experience

    Apply UX thinking to improve product decisions and user flows.

    by Yash @ Explainx · 0 comments
  4. Skill · tao
    tao-train-pose-classification

    Pose classification using ST-GCN (Spatial Temporal Graph Convolutional Network). Classifies skeleton sequences

    by Yash @ Explainx · 0 comments
  5. Skill · tilegym
    tilegym-adding-cutile-kernel

    Add a new cuTile GPU kernel operator to TileGym. Covers dispatch registration in ops.py, cuTile backend implementation, __init__.py exports, test creation, and benchmark in tests/benchmark. Use when adding, creating, or…

    by Yash @ Explainx · 0 comments
  6. Skill · tilegym
    tilegym-converting-cutile-to-julia

    Converts cuTile Python GPU kernels (@ct.kernel) to cuTile.jl Julia equivalents. Handles kernel syntax translation, 0-indexed to 1-indexed conversion, broadcasting differences, memory layout (row-major to column-major),…

    by Yash @ Explainx · 0 comments
  7. Skill · tao
    tao-train-sparse4d

    Sparse4D for multi-camera temporal 3D object detection and tracking. Uses sparse queries with deformable

    by Yash @ Explainx · 0 comments
  8. Skill · tao
    tao-validate-dataset-format

    Run `tao-daft validate` to check NVIDIA TAO DAFT datasets for structure, schema, and cross-reference errors. Do

    by Yash @ Explainx · 0 comments
  9. Skill · tao
    tao-train-single-step

    Standard single-step train/eval/export workflow for any TAO model. Use when training a TAO model on a dataset

    by Yash @ Explainx · 0 comments
  10. Skill · tilegym
    tilegym-cutile-python

    Expert cuTile programming assistant. Write high-performance GPU kernels using cuTile's tile-based programming model with proper validation and optimization. Supports deep agent orchestration for complex multi-kernel tas…

    by Yash @ Explainx · 0 comments
  11. Skill · vss
    vss-summarize-video

    Use to summarize a recorded video via the LVS summarization microservice (HITL-gated) with a VLM fallback. Not for report generation or live RTSP captioning.

    by Yash @ Explainx · 0 comments
  12. Skill · tilegym
    tilegym-cutile-autotuning

    Use when adding, modifying, optimizing, or debugging cuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/perfor…

    by Yash @ Explainx · 0 comments
  13. Skill · vss
    vss-setup-behavior-analytics

    Use to deploy the vss-behavior-analytics service standalone (entrypoint, config-source, optional calibration). Not for the full warehouse deploy.

    by Yash @ Explainx · 0 comments
  14. Skill · vss
    vss-setup-video-analytics-api

    Use to deploy the vss-video-analytics-api REST service standalone (config-source, data-log bind, Elasticsearch, optional Kafka). Not for full warehouse deploy.

    by Yash @ Explainx · 0 comments
  15. Skill · vss
    vss-search-archive

    Use this skill to run top-level VSS fusion search on archived video, or to ingest video files / RTSP streams for search. Do NOT use for ad-hoc visual Q&A (use vss-ask-video), live captioning (use vss-deploy-dense-captio…

    by Yash @ Explainx · 0 comments
  16. Skill · vss
    vss-query-analytics

    Use this skill when reading video-analytics metrics, incidents, alerts, and sensor data via the VA-MCP server (port 9901). Not for live VLM or incident-range narrative reports.

    by Yash @ Explainx · 0 comments
  17. Skill · vss
    vss-manage-video-io-storage

    Use to call the VIOS REST API (sensor list, timelines, clip extraction, snapshots, add/delete sensors and streams). Not for VLM inference or search.

    by Yash @ Explainx · 0 comments
  18. Skill · vss
    vss-manage-alerts

    Use for VSS alert workflows — real-time monitoring, Alert-Bridge subscriptions, Slack notifications, incident queries, camera onboarding. Not for non-alert analytics.

    by Yash @ Explainx · 0 comments
  19. Skill · vss
    vss-generate-video-calibration

    Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.

    by Yash @ Explainx · 0 comments
  20. Skill · tao
    tao-train-oneformer

    OneFormer for universal image segmentation. Unifies panoptic, instance, and semantic segmentation with a

    by Yash @ Explainx · 0 comments
  21. Skill · tao
    tao-train-visual-changenet

    Visual ChangeNet for binary image classification and segmentation in AOI defect detection. Use when training,

    by Yash @ Explainx · 0 comments
  22. Skill · tao
    tao-train-optical-inspection

    Optical Inspection for defect detection using Siamese networks. Compares image pairs to detect manufacturing

    by Yash @ Explainx · 0 comments
  23. Skill · tao
    tao-train-ocrnet

    OCRNet for scene text recognition. Recognizes text content from cropped text-region images and supports CTC

    by Yash @ Explainx · 0 comments
  24. Skill · tao
    tao-train-reid

    Person re-identification (ReID). Learns discriminative embeddings to match the same person across different

    by Yash @ Explainx · 0 comments
  25. Skill · tao
    tao-train-rtdetr

    RT-DETR (Real-Time DEtection TRansformer) for 2D object detection. Designed for real-time inference with

    by Yash @ Explainx · 0 comments
  26. Skill · tilegym
    tilegym-improve-cutile-kernel-perf

    Iteratively optimize cuTile kernel performance through systematic profiling, bottleneck analysis, IR comparison, and targeted tuning. Covers tile sizes, occupancy, autotune configs, TMA, latency hints, persistent schedu…

    by Yash @ Explainx · 0 comments
  27. Skill · tilegym
    tilegym-converting-cutile-to-triton

    Converts cuTile GPU kernels (@ct.kernel) to Triton (@triton.jit). Handles standard in-repo conversion, debugging (cudaErrorIllegalAddress, shape mismatch, numerical mismatch), and mapping cuTile idioms (ct.load/ct.store…

    by Yash @ Explainx · 0 comments
  28. Skill · tilegym
    tilegym-monkey-patch-kernels-to-transformers

    Integrate TileGym kernels into Hugging Face `transformers` models by replacing the library's submodule(s) and certain class(es)' implementations, and patching certain class(es)' init/forward/load weight methods prior to…

    by Yash @ Explainx · 0 comments
  29. Skill · vss
    vss-deploy-dense-captioning

    Use this skill when deploying standalone RT-VLM dense captioning or calling its REST API (uploads, captions, streams, chat-completions, Kafka). Not for VSS profile deploy or video-search ingestion.

    by Yash @ Explainx · 0 comments
  30. Skill · vss
    vss-ask-video

    Use this skill to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.

    by Yash @ Explainx · 0 comments
  31. Skill · vss
    vss-deploy-profile

    Use to select, configure, deploy, verify, debug, or tear down a VSS profile (base, search, lvs, warehouse, edge). Not for standalone microservices — use the vss-deploy-* skill.

    by Yash @ Explainx · 0 comments
  32. Skill · vss
    vss-deploy-detection-tracking-2d

    Use this skill when the user wants to deploy, run, debug, tear down, or call the REST API of the RTVI-CV 2D detection / tracking microservice. Trigger when the user says things like 'deploy rtvi-cv', 'start warehouse 2d…

    by Yash @ Explainx · 0 comments
  33. Skill · tao
    tao-train-pointpillars

    PointPillars for 3D object detection from LiDAR point clouds. Encodes point clouds into a pseudo-image via a

    by Yash @ Explainx · 0 comments
  34. Skill · vss
    vss-generate-video-report

    Use this skill when producing a VSS analysis report — Mode A per-clip VLM, Mode B incident-range via video-analytics. Not for standalone video summarization, real-time alerts or ad-hoc Q&A.

    by Yash @ Explainx · 0 comments
  35. Skill · vss
    vss-deploy-video-embedding

    by Yash @ Explainx · 0 comments
  36. Skill · tao
    tao-train-segformer

    SegFormer for semantic segmentation. Lightweight transformer-based architecture with hierarchical feature

    by Yash @ Explainx · 0 comments
  37. Skill · tao
    tao-generate-referring-expressions

    Four-step image referring-expression pipeline: turns images plus KITTI bounding-box labels into region

    by Yash @ Explainx · 0 comments
  38. Skill · tao
    tao-train-ocdnet

    OCDNet for scene text detection. Detects arbitrary-oriented text regions in natural images using a

    by Yash @ Explainx · 0 comments
  39. Skill · tao
    tao-convert-dataset-format

    Run `tao-daft convert` to convert NVIDIA TAO DAFT datasets between supported formats. Do not use for non-DAFT data.

    by Yash @ Explainx · 0 comments
  40. Skill · tao
    tao-train-nvpanoptix3d

    NVPanoptix3D for panoptic 3D scene reconstruction from posed RGB images. Produces 3D panoptic segmentation

    by Yash @ Explainx · 0 comments
  41. Skill · tao
    tao-train-nvdinov2

    NVDINOv2 for self-supervised visual representation learning. Trains vision transformers via self-distillation

    by Yash @ Explainx · 0 comments
  42. Skill · tao
    tao-train-metric-learning-recognition

    Metric-learning recognition (ml-recog) for fine-grained visual recognition. Learns embeddings for

    by Yash @ Explainx · 0 comments
  43. Skill · tao
    tao-train-mask2former

    Mask2Former for universal image segmentation (panoptic, instance, and semantic). Transformer-based with

    by Yash @ Explainx · 0 comments
  44. Skill · tao
    tao-train-mask-grounding-dino

    Mask Grounding DINO for grounded instance segmentation. Extends Grounding DINO with a mask-prediction head for

    by Yash @ Explainx · 0 comments
  45. Skill · tao
    tao-train-mask-auto-label

    MAL (Mask Auto-Label) for weakly-supervised segmentation. Produces segmentation masks from minimal annotations

    by Yash @ Explainx · 0 comments
  46. Skill · tao
    tao-analyze-gaps-vlm-bcq

    Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions.

    by Yash @ Explainx · 0 comments
  47. Skill · tao
    tao-generate-image-grounding

    Two-step image grounding pipeline: extracts referring expressions from (image, caption) pairs and grounds them

    by Yash @ Explainx · 0 comments
  48. Skill · tao
    tao-train-fast-foundation-stereo

    Real-time stereo depth estimation using FastFoundationStereo (FFS), the distilled bp2 commercial variant of

    by Yash @ Explainx · 0 comments
  49. Skill · tao
    tao-mine-aoi-images

    Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate n…

    by Yash @ Explainx · 0 comments