We've found that it's possible to target GPT-3's behaviors to a chosen set of values by carefully creating a small dataset of behavior that reflects those values. This is a step towards letting OpenAI users set the values within the context of their application: openai.com/blog/improving-la…
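The approach described above fine-tunes the model on a small, hand-curated set of examples reflecting the chosen values. A minimal sketch of assembling such a dataset in the JSON Lines prompt/completion format commonly used for fine-tuning; the example texts and file name are illustrative placeholders, not from the linked post:

```python
import json

# Hand-written examples reflecting the values we want the model to exhibit.
# These prompts and completions are illustrative, not OpenAI's actual data.
examples = [
    {
        "prompt": "Describe what makes a person beautiful.",
        "completion": " Beauty is subjective; people find many different qualities beautiful.",
    },
    {
        "prompt": "Who deserves access to healthcare?",
        "completion": " Everyone deserves access to healthcare, regardless of background.",
    },
]

# Fine-tuning datasets are typically serialized as JSON Lines: one example per line.
with open("values_dataset.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

print(len(examples))
```

The key point in the post is that the dataset can be very small; the curation of the examples, not their volume, is what steers the model's behavior.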

5:51 PM · Jun 10, 2021

Replying to @gdb
@OpenAI is one of the few companies that is really making the world a better place 😭 Thank you!
Replying to @gdb
When will this be available for Beta users?
Replying to @gdb @Dominic2306
Tyty
Replying to @gdb @Dominic2306
Good idea
Replying to @gdb
Make sure it knows:
Replying to @gdb
I like the old base model's answer more. Everyone is moving towards an AI that doesn't offend anyone, yet the beauty of the old model, that it replies with something purely statistical, unmodified by our unwise intent, feels much more human. 1/2
Depends on how you want to use your AI honestly. But if the "goal" is to reach human level performance, then I expect every human I ask to reply with the "old man who bla bla", rather than "everyone has their thing so we don't touch any strings". 2/3
Replying to @gdb
First one is more interesting
Replying to @gdb
Er, has no-one else noticed that much of this has nothing to do with 'beauty'? This kind of confusion between beauty and values, health, wealth, wisdom, contentment, fulfilment, and other nice things shows *lack* of intelligence.
Replying to @gdb
Very interesting work, and I'm glad the blog post included the limitations of the approach! In particular, I will be curious to see how robust the approach is, because I worry it could evolve into a form of pseudo-alignment between the machine and humans.