
Commit 59d0db1

updated version of Florence2, Qwen2.5 and Gemini 2.5 notebooks
1 parent 5d3f62c commit 59d0db1

4 files changed, with 553 additions and 4217 deletions

notebooks/how-to-finetune-florence-2-on-detection-dataset.ipynb

Lines changed: 14 additions & 9 deletions
@@ -3108,20 +3108,15 @@
 "\n",
 "---\n",
 "\n",
+"[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/how-to-finetune-florence-2-on-detection-dataset.ipynb)\n",
 "[![Roboflow](https://raw.githubusercontent.com/roboflow-ai/notebooks/main/assets/badges/roboflow-blogpost.svg)](https://blog.roboflow.com/florence-2/)\n",
 "[![arXiv](https://img.shields.io/badge/arXiv-2311.06242-b31b1b.svg)](https://arxiv.org/abs/2311.06242)\n",
 "\n",
 "Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license. The model demonstrates strong zero-shot and fine-tuning capabilities across tasks such as captioning, object detection, grounding, and segmentation.\n",
 "\n",
-"![Florence-2 Figure.1](https://storage.googleapis.com/com-roboflow-marketing/notebooks/examples/florence-2-figure-1.png)\n",
-"\n",
-"*Figure 1. Illustration showing the level of spatial hierarchy and semantic granularity expressed by each task. Source: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.*\n",
-"\n",
 "The model takes images and task prompts as input, generating the desired results in text format. It uses a DaViT vision encoder to convert images into visual token embeddings. These are then concatenated with BERT-generated text embeddings and processed by a transformer-based multi-modal encoder-decoder to generate the response.\n",
 "\n",
-"![Florence-2 Figure.2](https://storage.googleapis.com/com-roboflow-marketing/notebooks/examples/florence-2-figure-2.png)\n",
-"\n",
-"*Figure 2. Overview of Florence-2 architecture. Source: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.*\n",
+"![Florence-2 Figure.1](https://storage.googleapis.com/com-roboflow-marketing/notebooks/examples/florence-2-figure-1.png)\n",
 "\n"
 ],
 "metadata": {
@@ -5296,9 +5291,19 @@
 {
 "cell_type": "markdown",
 "source": [
-"# Congratulations\n",
+"<div align=\"center\">\n",
+" <p>\n",
+" Looking for more tutorials or have questions?\n",
+" Check out our <a href=\"https://github.com/roboflow/notebooks\">GitHub repo</a> for more notebooks,\n",
+" or visit our <a href=\"https://discord.gg/GbfgXGJ8Bk\">discord</a>.\n",
+" </p>\n",
+" \n",
+" <p>\n",
+" <strong>If you found this helpful, please consider giving us a ⭐\n",
+" <a href=\"https://github.com/roboflow/notebooks\">on GitHub</a>!</strong>\n",
+" </p>\n",
 "\n",
-"⭐️ If you enjoyed this notebook, [**star the Roboflow Notebooks repo**](https://https://github.com/roboflow/notebooks) (and [**supervision**](https://github.com/roboflow/supervision) while you're at it) and let us know what tutorials you'd like to see us do next. ⭐️"
+"</div>"
 ],
 "metadata": {
 "id": "ag0XROk7fcd_"
