Yannis Katsis committed
Commit 55473a2 · 1 Parent(s): 2c0ff00

Update README.md for citations

Files changed (1):
1. citations/README.md +8 -17

citations/README.md CHANGED
@@ -3,18 +3,13 @@ license: apache-2.0
 language:
 - en
 pipeline_tag: text-generation
-base_model: ibm-granite/granite-3.3-8b-instruct
 library_name: peft
 library_name: transformers
 ---

 # Intrinsics for Citation Generation

-The following are experimental releases. They are still under development, but we wanted to let the open-source community take them for spin! Use them, break them, and help us build what's next – we'll keep an eye out for feedback and questions. Happy exploring!
-
-Just a heads-up: Experiments are forever evolving, so we can't commit to ongoing support or guarantee performance.
-
-# Model Summary
+## Model Summary

 This is a RAG-specific family of intrinsics fine-tuned for the citation generation task. Given a multi-turn conversation between a user and an AI assistant ending with an assistant response and a set of documents/passages on which the last assistant response is supposed to be based, each intrinsic in the family generates citations for the last assistant response from the provided documents/passages. The intrinsic has the following features:
 1. **Fine-grained citations:** The intrinsic generates citations for each sentence in the assistant response (when available). Moreover, each citation consists of a set of sentences from the documents/passages that support the corresponding sentence in the assistant response.
@@ -29,7 +24,7 @@ We provide two intrinsics implemented as LoRA adapters trained over Granite-3.3-
 - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

 ## Intended use
-This is a a family of citation generation intrinsics that give the ability to generate citations for the last assistant response in a multi-turn RAG conversation based on a set of provided documents/passages. They can be used to generate post-hoc citations for assistant responses generated by any LLM in a RAG setting.
+This is a family of citation generation intrinsics that give the ability to generate citations for the last assistant response in a multi-turn RAG conversation based on a set of provided documents/passages. They can be used to generate post-hoc citations for assistant responses generated by any LLM in a RAG setting.

 > [!TIP]
 > Note: While you can invoke a citation generation intrinsic directly, it is strongly recommended to call it through [granite-common](https://github.com/ibm-granite/granite-common), which wraps the model with a tailored I/O processor, enabling a friendlier development interface. The I/O processor takes care of several data transformation/validation tasks that would be otherwise required (incl. splitting the input documents and assistant response into sentences before calling the intrinsic as well as validating the intrinsic's output and transforming the returned sentence IDs into spans over the documents and the response). We next describe the input/output of the citation generation intrinsics when invoked through granite-common.
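
To make the division of labor concrete, the round trip through granite-common looks roughly like the sketch below. This is a minimal sketch, not code from this card: the `IntrinsicsRewriter`/`IntrinsicsResultProcessor` class names and the `io.yaml` config path are extrapolated from the `rewriter.transform(...)` calls shown later in this diff, so check the granite-common repository for the authoritative interface.

```python
import json

import granite_common

# Assumed names: IntrinsicsRewriter / IntrinsicsResultProcessor and the
# io.yaml config file are extrapolations, not values taken from this card.
io_yaml = "citation_generation/io.yaml"  # hypothetical path to the intrinsic's I/O config

rewriter = granite_common.IntrinsicsRewriter(config_file=io_yaml)
result_processor = granite_common.IntrinsicsResultProcessor(config_file=io_yaml)

# A chat completion request: the conversation plus the grounding documents.
with open("request.json", encoding="utf-8") as f:  # hypothetical input file
    request_json = json.load(f)

# Input processing: split documents and response into sentences, build the prompt.
rewritten_request = rewriter.transform(request_json)

# ... run inference on rewritten_request (vLLM or Transformers, see below) ...

# Output processing: validate the raw output and map the returned sentence IDs
# to character spans over the documents and the response, e.g.:
# citations = result_processor.transform(raw_response, rewritten_request)
```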
@@ -42,7 +37,7 @@ This is a a family of citation generation intrinsics that give the ability to ge

 ## Quickstart Example

-To run the citation generation intrinsics through granite-common, you can either (a) use an OpenAI-compatible inference backend, such as vLLM or (b) use the Hugging Face transformers library. We provide below instructions for each of the two approaches. Note that running inference using vLLM or another scalable OpenAI-compatible inference backend should be significantly faster than using the Hugging Face transformers library directly.
+To run the citation generation intrinsics through granite-common, you can either (a) use an OpenAI-compatible inference backend, such as vLLM or (b) use the Hugging Face Transformers library. We provide below instructions for each of the two approaches. Note that running inference using vLLM or another scalable OpenAI-compatible inference backend should be significantly faster than using the Hugging Face Transformers library directly.

 ### Using an OpenAI-Compatible Inference Backend
@@ -50,8 +45,7 @@ To run the intrinsic using an OpenAI-compatible inference backend, such as vLLM,

 1. Install the granite-common library:
 ```
-pip install git+https://github.com/ibm-granite/granite-common.git
-pip install granite_common[nltk]
+pip install granite-common[nltk]
 ```

 2. Install the Hugging Face CLI:
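
For approach (a) to work end to end, the adapter must be served behind an OpenAI-compatible endpoint. A plausible vLLM invocation is sketched below; the adapter path, served adapter name, and rank limit are placeholders rather than values from this card:

```
# Hypothetical: serve the base model with the citation-generation LoRA attached.
vllm serve ibm-granite/granite-3.3-8b-instruct \
  --enable-lora \
  --lora-modules citation_generation=/path/to/downloaded/lora \
  --max-lora-rank 64
```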
@@ -137,8 +131,7 @@ To run the intrinsic using an OpenAI-compatible inference backend, such as vLLM,
 request_json["temperature"] = 0.0

 # Apply input processor
-intrinsic_kwargs = {}
-rewritten_request = rewriter.transform(request_json, **intrinsic_kwargs)
+rewritten_request = rewriter.transform(request_json)

 # Run inference
 client = openai.OpenAI(base_url=openai_base_url, api_key=openai_api_key)
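
This hunk stops just short of the request itself; presumably the rewritten request is forwarded to the backend along the lines below. Whether `rewritten_request` unpacks directly into `chat.completions.create` is an assumption; the full card shows the exact call.

```python
# Assumed continuation of the snippet above: forward the rewritten request
# to the OpenAI-compatible backend and collect the raw completion.
chat_completion = client.chat.completions.create(**rewritten_request)

# The raw completion then goes through the output processor, e.g.:
# citations = result_processor.transform(chat_completion, rewritten_request)
```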
@@ -157,12 +150,11 @@ To run the intrinsic using an OpenAI-compatible inference backend, such as vLLM,

 ### Using the Hugging Face Transformers Library

-To run the intrinsic using the Hugging Face transformers library directly, follow the steps below. We recommend using Python 3.11 or higher.
+To run the intrinsic using the Hugging Face Transformers library directly, follow the steps below. We recommend using Python 3.11 or higher.

 1. Install the granite-common library:
 ```
-pip install git+https://github.com/ibm-granite/granite-common.git
-pip install granite_common[nltk]
+pip install granite-common[nltk]
 ```

 2. Install the Hugging Face CLI:
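
Step 2 appears only as context here and is not expanded in the hunk; a standard way to install the CLI is:

```
pip install -U "huggingface_hub[cli]"
```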
@@ -239,8 +231,7 @@ To run the intrinsic using the Hugging Face transformers library directly, follo
 request_json["temperature"] = 0.0

 # Apply input processor
-intrinsic_kwargs = {}
-rewritten_request = rewriter.transform(request_json, **intrinsic_kwargs)
+rewritten_request = rewriter.transform(request_json)

 # Load the base model and merge LoRA weights
 model, tokenizer = granite_common.util.load_transformers_lora(lora_dir)
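
The diff ends mid-snippet; once `load_transformers_lora` has returned the merged model and tokenizer, generation presumably proceeds with the standard Transformers API, roughly as follows. This is a sketch assuming `rewritten_request` is a dict carrying the processed `messages`; the full README may differ.

```python
# Sketch of the generation step that follows in the full README; the exact
# code there may differ. Assumes rewritten_request["messages"] holds the
# chat produced by the input processor.
inputs = tokenizer.apply_chat_template(
    rewritten_request["messages"],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
raw_output = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)

# raw_output is then handed to the result processor to obtain citations.
```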
 