Instructions to use microsoft/Florence-2-large with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use microsoft/Florence-2-large with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="microsoft/Florence-2-large", trust_remote_code=True)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use microsoft/Florence-2-large with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "microsoft/Florence-2-large"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/microsoft/Florence-2-large

SGLang

How to use microsoft/Florence-2-large with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "microsoft/Florence-2-large" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "microsoft/Florence-2-large" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Florence-2-large",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use microsoft/Florence-2-large with Docker Model Runner:
```
docker model run hf.co/microsoft/Florence-2-large
```

Florence-2 Large Model Struggles with Color Recognition in Phrase Grounding Task

#89

by DecTom - opened Dec 5, 2024

Discussion

DecTom

Dec 5, 2024

Hello everyone,

I recently experimented with the Florence-2 Large model, specifically on a task involving caption to phrase grounding. Here's what I observed:

Scenario: I used two images containing various colored electric cables and provided the text input "yellow electric cables" for the phrase grounding task.

Issue: Instead of accurately identifying and surrounding the yellow electric cables, the model surrounded the entire picture with a bounding box labeled "yellow electric cables." in the first image and surrounded red electric cables with a bounding box labeled "yellow electric cables." in the second image.

This suggests that the model may have difficulties distinguishing specific colors or objects within an image when given a direct phrase to ground. Has anyone else experienced similar issues with color recognition in phrase grounding tasks using Florence-2 Large? If so, how did you address it?

I'm looking forward to hearing your thoughts and suggestions!

Thank you!

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment