---
tags:
- autotrain
- text-generation-inference
- text-generation
- peft
library_name: transformers
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
widget:
- messages:
  - role: user
    content: What is your favorite condiment?
license: other
---

# Model Trained Using AutoTrain

This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

# Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
from peft import PeftModel, PeftConfig
import torch
import time

model_name = "Punthon/llama3-5-sdgs"

# Load the PEFT configuration
peft_config = PeftConfig.from_pretrained(model_name)

# Load the tokenizer from the base model
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)

# Load the base model in 8-bit precision (requires the bitsandbytes package)
base_model = AutoModelForCausalLM.from_pretrained(
    peft_config.base_model_name_or_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Resize the base model embeddings to match the tokenizer
base_model.resize_token_embeddings(len(tokenizer))

# Load your fine-tuned model
model = PeftModel.from_pretrained(base_model, model_name)

# Define the instruction
instruction = """You are an assistant for tasks in environmental impact assessment (EIA).
An excerpt from the textual content of an EIA report is provided by the user. After it, 5 Sustainable Development Goal (SDG) targets are also provided, each target with its corresponding SDG target ID.
Please ANSWER by identifying *all* the SDG targets that are relevant to be addressed in the context of the provided excerpt.
Please answer to the best of your ability. If you don't know the answer, just say that you don't know.
Keep the answer concise. When you refer to a target in your answer, always cite the corresponding SDG target ID (which must be among the given SDG targets) between square brackets (e.g. [4.7]), as is done in each example.
Examples are given below, each example between the '<example>' and '</example>' tags. After that, you are given the actual EIA excerpt so that you identify *all* the relevant SDG targets."""

examples = """
<example>
EXCERPT: The project focuses on enhancing access to clean water and sanitation facilities in rural communities.
SDG TARGETS:
Target ID: 6.1
Target: By 2030, achieve universal and equitable access to safe and affordable drinking water for all.
ANSWER: This excerpt is relevant to SDG Target [6.1] as it focuses on providing clean water access.
</example>

<example>
EXCERPT: The construction of new roads is planned to connect isolated regions with urban areas, facilitating trade and access to services.
SDG TARGETS:
Target ID: 9.1
Target: Develop quality, reliable, sustainable, and resilient infrastructure, including regional and transborder infrastructure, to support economic development and human well-being, with a focus on affordable and equitable access for all.
ANSWER: This excerpt is relevant to SDG Target [9.1] due to its focus on infrastructure development to enhance connectivity.
</example>
"""

# Define the actual EIA excerpt and SDG targets
input_text = "Thailand is considered a leader in tiger conservation in Southeast Asia. Most recently at the 'Sustainable Finance for Tiger Landscapes Conservation' conference in Bhutan, Thailand has been declared as the 'Champion for Tiger Conservation in Southeast Asia.'"
sdg_targets = """
SDG TARGETS:
Target ID: 15.1
Target: By 2020, ensure the conservation, restoration, and sustainable use of terrestrial and inland freshwater ecosystems and their services, in particular forests, wetlands, mountains, and drylands, in line with obligations under international agreements.
Target ID: 15.2
Target: By 2020, promote the implementation of sustainable management of all types of forests, halt deforestation, restore degraded forests, and substantially increase afforestation and reforestation globally.
Target ID: 15.5
Target: Take urgent and significant action to reduce the degradation of natural habitats, halt the loss of biodiversity and, by 2020, protect and prevent the extinction of threatened species.
"""

# Format the prompt
prompt = f"""
{instruction}

{examples}

Now, your task.
EXCERPT: {input_text}
{sdg_targets}
ANSWER:
"""

# Define generation configuration
generation_config = GenerationConfig(
    do_sample=True,
    top_k=30,
    temperature=0.7,
    max_new_tokens=200,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.eos_token_id,
)

# Tokenize the prompt and move it to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the answer
st_time = time.time()
outputs = model.generate(**inputs, generation_config=generation_config)

# Decode only the newly generated tokens (otherwise the full prompt is echoed back)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Response time: {time.time() - st_time:.2f} seconds")
print(response)
```
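
Because the prompt instructs the model to cite each relevant target ID between square brackets (e.g. `[15.1]`), the cited IDs can be recovered from the decoded response with a small regex helper. This is an illustrative sketch, not part of the model's code; the `extract_target_ids` name is our own:

```python
import re

def extract_target_ids(response: str) -> list[str]:
    """Return the unique SDG target IDs cited in square brackets, e.g. [15.1] or [4.a]."""
    ids = re.findall(r"\[(\d+\.[0-9a-z]+)\]", response)
    # Deduplicate while preserving the order of first citation
    return list(dict.fromkeys(ids))

sample = "Relevant targets are [15.1] and [15.5]; [15.1] covers ecosystem conservation."
print(extract_target_ids(sample))  # ['15.1', '15.5']
```

Validating the extracted IDs against the targets actually offered in `sdg_targets` is a cheap way to catch hallucinated citations before using the output downstream.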