
Notes - Grounded SAM
What is Grounded SAM? The Grounded SAM paper introduces a novel approach to open-set segmentation by combining two powerful pre-trained models...
A collection of 17 posts tagged with "blog".
Can LLMs actually reason, or are they just “probabilistic pattern matchers”? This paper attempts to answer that question.
Here is a short compilation of bullet points gathered while reading the paper "Retrieval-Augmented Generation for Large Language Models: A Survey".
The main motivation behind YOLOX was to update the YOLO series with the recent advancements at the time, particularly anchor-free detection.
This is a short review of the paper titled "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics" by Kendall et al., 2018.
Here is a short tutorial on how to fit polynomials using PyTorch.
I recently decided to migrate my Ghost blog from a Ghost(Pro) subscription to a DigitalOcean droplet. Primary reasons for the migration were...
SVD (Singular Value Decomposition) is one of my favorite topics in linear algebra. It's almost magical to factorize any matrix...
The authors propose a new framework of loss functions, motivated by the Taylor series expansion of commonly used functions like cross entropy.
Natural language processing (NLP) is a branch of science sitting at the intersection of computer science, artificial intelligence, and computational linguistics.
If you're into gaming and deep learning, you need to own a GPU. For years I was working with an older GPU (GTX 960M), but I thought it was time to upgrade.
EfficientNet proposes a heuristic for scaling a CNN that relates its resolution, width, and depth.
What exactly are we trying to accomplish? Will the new model architecture really be a game-changer? How much impact will this new dataset have...?
Semantic segmentation involves partitioning/marking regions in the image belonging to different objects/classes. This short article summarises DeepLab V3+...
Once we've trained multiple detection/classification models, how do we choose the best one? And once we've chosen it, how do we choose the optimum operating point?