## Usage and limitations

### Intended usage

Open Vision Language Models (VLMs) have a wide range of applications across
various industries and domains. The following list of potential uses is not
comprehensive. The purpose of this list is to provide contextual information
about the possible use cases that the model creators considered as part of model
training and development.

* The model can be further fine-tuned on a bigger and better dataset, or on your own custom dataset.
* The model can be used in apps to provide real-time visual and text-based assistance in Hindi and English.
* The model can serve as a tool for researchers to develop new vision-language technologies and applications.

### Ethical considerations and risks

The development of vision-language models (VLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following:

* Bias and fairness
  * VLMs trained on large-scale, real-world image-text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing is described, and posterior evaluations are reported, in this card.
* Misinformation and misuse
  * VLMs can be misused to generate text that is false, misleading, or harmful.
  * Guidelines are provided for responsible use with the model; see the [Responsible Generative AI Toolkit](https://ai.google.dev/responsible).
* Transparency and accountability
  * This model card summarizes details on the model's architecture, capabilities, limitations, and evaluation processes.
  * A responsibly developed open model offers the opportunity to share innovation by making VLM technology accessible to developers and researchers across the AI ecosystem.

Risks identified and mitigations:

* **Perpetuation of biases:** Continuous monitoring (using evaluation metrics
  and human review) and the exploration of de-biasing techniques during model
  training, fine-tuning, and other use cases are encouraged.
* **Generation of harmful content:** Mechanisms and guidelines for content
  safety are essential. Developers are encouraged to exercise caution and
  implement appropriate content safety safeguards based on their specific
  product policies and application use cases.
* **Misuse for malicious purposes:** Technical limitations and developer and
  end-user education can help mitigate against malicious applications of VLMs.
  Educational resources and reporting mechanisms for users to flag misuse are
  provided. Prohibited uses of Gemma models are outlined in the [Gemma
  Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy).
* **Privacy violations:** Models were trained on data filtered to remove certain personal information and other sensitive data. Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques.

### Limitations

* Most limitations inherited from the underlying Gemma model still apply:
  * VLMs are better at tasks that can be framed with clear prompts and
    instructions. Open-ended or highly complex tasks might be challenging.
  * Natural language is inherently complex. VLMs might struggle to grasp
    subtle nuances, sarcasm, or figurative language.
  * VLMs generate responses based on information they learned from their
    training datasets, but they are not knowledge bases. They may generate
    incorrect or outdated factual statements.
  * VLMs rely on statistical patterns in language and images. They might
    lack the ability to apply common sense reasoning in certain situations.