I gave GPT-3 access to Chrome with the objective "please buy me Airpods". Pretty interesting if you ask me 🤔

9:49 AM · Jun 17, 2021

It successfully made it to the product page, but got sidetracked with Walmart's privacy policy.
Since even a simplified DOM is far too large for a single prompt, multiple prompts are given different chunks of the DOM, each generating their own "interaction". Another prompt then takes all the proposed interactions and selects the best one, sort of like a tournament bracket.
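A minimal sketch of the chunk-then-bracket idea described above, not the author's actual code. The names `chunk_dom`, `tournament`, and `choose_action` are hypothetical, as are the `propose` and `pick_best` callables, which stand in for the language-model prompts. Splitting the DOM by character count is also an assumption made for brevity.

```python
from typing import Callable, List


def chunk_dom(dom: str, chunk_size: int) -> List[str]:
    """Split a simplified DOM string into fixed-size pieces (character-based here)."""
    return [dom[i:i + chunk_size] for i in range(0, len(dom), chunk_size)]


def tournament(candidates: List[str],
               pick_best: Callable[[List[str]], str],
               group_size: int = 2) -> str:
    """Reduce candidates group by group until a single winner remains."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    while len(candidates) > 1:
        winners = []
        for i in range(0, len(candidates), group_size):
            group = candidates[i:i + group_size]
            # A lone candidate advances without spending a prompt on it.
            winners.append(group[0] if len(group) == 1 else pick_best(group))
        candidates = winners
    return candidates[0]


def choose_action(dom: str,
                  objective: str,
                  propose: Callable[[str, str], str],
                  pick_best: Callable[[List[str], str], str],
                  chunk_size: int = 2000) -> str:
    """One proposal per DOM chunk, then a bracket to select the final action."""
    proposals = [propose(chunk, objective) for chunk in chunk_dom(dom, chunk_size)]
    return tournament(proposals, lambda group: pick_best(group, objective))


# Stand-ins for the model calls (hypothetical; a real version would send prompts).
def fake_propose(chunk: str, objective: str) -> str:
    return f"click:{chunk}"


def fake_pick_best(group: List[str], objective: str) -> str:
    return next((c for c in group if "buy" in c), group[0])
```

With the stand-ins, `choose_action("aaaabbbbbuy!cccc", "please buy me Airpods", fake_propose, fake_pick_best, chunk_size=4)` runs four proposals through two bracket rounds.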
For more complex web pages, the time it takes to generate an action scales as O(log n) with the size of the DOM – really fast! It also gets around token limits, so you could technically process an arbitrarily large DOM!
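To make the logarithmic claim concrete, here is a small sketch that counts bracket rounds for a given number of DOM chunks. The function name `bracket_rounds` and the `group_size` parameter are hypothetical. It assumes every prompt within a round runs concurrently, which the thread does not state.

```python
def bracket_rounds(num_chunks: int, group_size: int = 2) -> int:
    """Number of selection rounds needed to reduce num_chunks proposals to one."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rounds = 0
    remaining = num_chunks
    while remaining > 1:
        # Each round keeps one winner per group (ceiling division).
        remaining = -(-remaining // group_size)
        rounds += 1
    return rounds
```

For example, `bracket_rounds(8)` is 3 and `bracket_rounds(1000)` is 10. Only the wall-clock depth grows logarithmically under this assumption. The total number of prompts still grows roughly linearly with the number of chunks.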
Replying to @sharifshameem
How does that work conceptually?
What I meant is: GPT-3 is "just" a language model. How does it have a goal or navigate a website? (honest question)
Replying to @sharifshameem
This is amazing actually! Super curious about the implementation 🤔
Replying to @wichmaennchen
Each prompt contains a simplified version of the DOM, the objective, and possible actions (like click on an element).
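As a rough illustration of those three ingredients, a prompt could be assembled like this. The layout, the section labels, and the `build_prompt` name are assumptions; the thread does not show the real prompt format.

```python
from typing import Sequence


def build_prompt(dom_chunk: str, objective: str, actions: Sequence[str]) -> str:
    """Combine the objective, one simplified DOM chunk, and the allowed actions."""
    action_lines = "\n".join(f"- {action}" for action in actions)
    return (
        f"Objective: {objective}\n\n"
        f"Simplified DOM:\n{dom_chunk}\n\n"
        f"Possible actions:\n{action_lines}\n\n"
        "Next action:"
    )


# Hypothetical inputs, for illustration only.
example_prompt = build_prompt(
    '<button id="1">Add to cart</button>',
    "please buy me Airpods",
    ["CLICK <id>", "TYPE <id> <text>"],
)
```

The trailing `Next action:` line leaves the model to complete the prompt with a single action. A few worked examples could be prepended to the same string.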
Replying to @sharifshameem
OpenAI has some issues with selecting context. I tried a few different ways, but it couldn't select a number or a word from a given list. I was thinking of coming up with a different solution using the Search API. I think the Search API could be a better way to find the correct button.
Replying to @sharifshameem
Very cool. Have you thought about any other interesting use cases for this? It would be interesting to imagine a day when websites design their DOM structures to make it easy for GPT-3 bots to use them. A bit like purposely designing motorways for self-driving cars.
Replying to @sharifshameem
This was without finetuning?
No fine-tuning, just few-shot prompting.