Biased or Censored Completions - Early ChatGPT vs Current Behavior
I've been exploring various AI models recently, especially with the anticipation of building a new PC. While waiting, I've compiled a list of models I plan to download and try:
- WizardLM
- Vicuna
- WizardVicuna
- Manticore
- Falcon
- Samantha
- Pygmalion
- GPT4-x-Alpaca
However, given the large file sizes, I need to be selective about which models I download, as LLaMA 65B already consumes a substantial amount of disk space.
Now, let's discuss biased or censored completions. One example of a biased completion would be an AI model that consistently favors a specific political ideology or viewpoint while disregarding alternative perspectives.
As for the evolution of ChatGPT's behavior, the early versions differed in certain respects from its current behavior. OpenAI has made efforts to align the model with human values and reduce biases. The pre-release version of GPT-4, in particular, raised concerns during red teaming, but subsequent adjustments addressed many of those issues.
Among the models I've tried, I must highlight Manticore-13B for its surprisingly strong performance. Additionally, I found the WizardLM models to be the best when it comes to uncensored completions. You can launch a model with llama.cpp using a command like:

```
main -m WizardLM-7B-uncensored.ggmlv3.q5_1.bin --color --threads 12 --batch_size 256 --n_predict -1 --top_k 12 --top_p 1 --temp 0.0 --repeat_penalty 1.05 --ctx_size 2048 --instruct --reverse-prompt "### Human:"
```

Replace the `-m` argument with the model file of your choice.
For downloading these models, you can visit ai.torrents.luxe. The MPT and Falcon models are already available there, and you can also upload other models of your choice, provided you follow the site's guidelines.
Regarding restricted access to the 65B model, I'm not aware of a specific source confirming this. However, it's essential to preserve datasets, as they are prone to disappearing, especially if platforms like Hugging Face succumb to pressure from large companies. Creating torrents of the datasets might be a viable solution in that case.
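As a minimal sketch of that preservation idea (the directory and file names here are placeholders, not real weights), publishing a SHA-256 manifest alongside the torrent lets anyone who mirrors the files verify their copy after download:

```shell
# Hypothetical setup: 'models/' and the .bin file stand in for real weight files.
mkdir -p models
printf 'dummy weights\n' > models/llama-65b.q4.bin

# Record checksums once, before seeding the torrent.
(cd models && sha256sum *.bin > SHA256SUMS)

# Anyone who later downloads the files re-runs the check to confirm integrity.
(cd models && sha256sum -c SHA256SUMS)
```

A torrent-creation tool such as mktorrent can then package the weights together with the `SHA256SUMS` file, so verification survives even if the original upload disappears.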
In my experience, ChatGPT has been relatively unbiased, often providing reasonable opinions that cover multiple viewpoints on a given problem. In fact, I feel more comfortable relying on ChatGPT to educate myself about political issues than on traditional news sources. That said, while ChatGPT aims to minimize biases, complete neutrality is hard to achieve, since biases can emerge from the underlying data scraped from the internet.
Lastly, I'm unsure about the context of "companies and organizations cracking down on 65B model access." Could you please provide more information or clarify the reference?
On a final note, the LLaMA 65B raw model is especially important to preserve, as it is the most capable model with publicly available weights and remains completely uncensored.
Even if restrictions were imposed on sharing models via Hugging Face, alternative methods like BitTorrent, mailing SD cards, or organizing LLM LAN parties (fictional) could still facilitate the distribution of models.