

A Spatial Vision for Author & Reader

Work on porting the software Author and Reader to visionOS is necessarily explorative. These are my personal notes to guide the research, based on experience and personal preference, to be revised as we build experiments, experience what it actually feels like, and learn what really works in this new environment. A design constraint is that the user should be able to move seamlessly between working in spatial and traditional environments. The practical opportunities for coding in visionOS, and the resources available, are constrained, so we have had to decide which problem to solve first. Since space is the key difference of this medium, we have decided to go large and build a Defined Concepts Map, as already implemented in Author for macOS.

  

Initial Problem to Solve: Mapping Concepts

Author supports the user defining concepts which can be viewed on a Map; testing shows this can be more powerful in an expanded space where the Map can cover a larger area.

Integration with other views, such as the main document view and the document outline, can include the ability for the user to toggle the Map between being a background and a foreground element:

Above, schematic view of basic Author views. Below, same Author views tested in visionOS:

 

Layout Tests

Layouts have been prototyped and experimented with in currently available headsets, such as the Quest Pro:

 

Interaction Test

A test implementation by Brandel is available to try on any headset. To use it, open the following link in Google Chrome, then drag an Author document (.liquid) onto the bar at top left. You will then see the Map layout and can interact with it. You can also choose to view this in your VR headset, in which case you will need to visit the same location to interact with the Map: https://t.co/nEIoUpiUsW

 

Currently implemented Map in macOS

The Defined Concepts Map in Author. The potential for porting this to visionOS is truly powerful. A key aspect of the Map is that connections are not shown all the time, which would produce a cluttered view, but only when nodes are selected, allowing a more fine-grained understanding of how the concepts relate:

    

Multiple Maps

Creating a Map with a large number of nodes is not trivial, but it is relatively easy; the issue becomes one of controlling the views.

The integration of different Maps, both in terms of data access and use and in terms of user interactions, is expected to be the main challenge: to integrate Maps such as those listed below, with questions such as, in visionOS, should these be accessed as windows, volumes or spaces, or, ideally, any one the user chooses?

  • Defined Concepts
  • Library of References
  • Journal Research Notes
  • User’s articles
  • Online Resources

  

Interactions & Connections

Below is an illustration of several such Maps. A key design challenge will be how to move between them, how to see connections, how to change layouts and views, and how they can be nested:

Below is a set of commands for the user to specify what should be seen in a Map, for consideration as to what options should be visible in spatial environments:

  

Openness & Metadata Interoperability

The defined concepts are currently exported as Glossaries, and work should be done to explore including the layout of the Map view on export. Such a Map view should be extractable from documents when reading and in the Library, so there needs to be a visual and interaction means for the user to easily understand the different Maps and what scale they relate to.

Early Tests

  

Discussion of further Work Modes

    

Authoring

Writing a short document should be relatively similar to writing on a laptop or desktop.

Writing a longer document becomes more of an issue of editing: moving elements around and seeing how they relate. This can likely benefit from innovative views in a larger space, going further than Author’s fold views, perhaps allowing the user to move sections off into separate displays to work on them while they remain part of the whole.

  

Reading

Finding what to read in a document can likely benefit from innovative views of the document, going further than what we have done in Reader, including showing all pages on a very large ‘wall’ display.

Reading pages or screens of text in this environment should be optimised for what is most comfortable for the eyes.

  

Library

A Library which supports innovative views may also benefit from a fully immersive space in order for the user to be able to arrange documents/books in space using layouts with specific themes/topics etc.

In order for a Library to have open books, so to speak, the metadata needs to be readily available. Visual-Meta as well as JSON and other means will be investigated to support this.
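As a sketch of what such readily available metadata might look like, the record below is a hypothetical example loosely modelled on bibliographic fields; it is not an actual Visual-Meta schema, and the field names are assumptions for illustration.

```python
import json

# Hypothetical metadata record for one Library entry — fields loosely
# modelled on bibliographic data; not an actual Visual-Meta schema.
book = {
    "title": "The Future of Text",
    "author": "Frode Hegland",
    "year": "2020",
    "filename": "future-of-text.pdf",
    "concepts": ["hypertext", "metadata", "spatial computing"],
}

# Serialising to JSON makes the metadata readable by any tool,
# supporting the 'open books' idea above.
serialized = json.dumps(book, indent=2)
```

Whether the source of truth is Visual-Meta, JSON or something else, the point is that a spatial Library view only needs a record like this per document to arrange books by theme or topic.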

    

Research Journal/Notes

Another challenge is how to augment the user’s ability to take notes and then find their notes later, including notes on documents for citing them.

This is probably one area where spatial environments can really shine, with easily manipulable ‘murder walls’ and other types of views being deployed.

I am at the University of Southampton, and was teacher of the year 2014 at London College of Communication.

  

  

Frode Hegland
June 2023

 

 

the future of text, the future of thought, is richly multidimensional

  

  

   

VR

There is no question in my mind that over the next few years headsets for VR and AR will become commonplace for anyone who works with text.

Issues to overcome include reducing the screen door effect so that reading text becomes pleasant; weight and comfort for longer sessions; cost; and, maybe most importantly and surprisingly, given that it has not yet been solved, instant access to the user’s information landscape and colleagues (though I do expect Apple to make some major advances here).

Opportunities initially include space, space and more space to think and work. Beyond this there is almost limitless potential for text and information interaction in general, something we have only started to scratch the virtual surface of.

The importance of today. I feel quite strongly that it will be important to develop this entirely new, all-encompassing environment as well as possible and not only based on the initial commercial explorations of big tech companies. This is why I have been experimenting to experience and hosting dialog on the future of text in VR through the Lab.

‘Ask AI’ in Reader

Because AI queries while writing are likely quite different from those while reading, we have added AI to Reader directly.

Goal: Give users the power of AI when reading to help them evaluate & understand the material and how it connects to other knowledge in their field.

  

Issue Command

The user ctrl-clicks on selected text and a new command, ‘Ask AI’, appears at the very top of the menu:

  

This menu has sub items, as listed here (subject to editing):

  • ‘What is this’ (W)
  • ‘Show me more examples’ (S)
  • ‘What does this relate to’ (R)
  • ‘Show me counter examples’ (C)
  • ‘Is this correct?’ (I)
  • ‘Explain the concept of’ (E)
  • ‘Create a timeline of this’ (T)
  • ‘Discuss the causes and effects of’ 
  • ‘Edit’ (which opens Reader’s Preferences to allow the user to design their own)
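As an illustration of how these commands could work internally, the sketch below maps each menu item to a prompt prefix that is prepended to the selected text before it is sent to the AI. The prefix wording and the compose function are assumptions, not Reader’s actual implementation.

```python
# Prompt prefixes for each 'Ask AI' command; the wording of the prefixes
# is an assumption for illustration, not Reader's actual prompts.
PROMPTS = {
    "What is this": "What is this: ",
    "Show me more examples": "Show me more examples of: ",
    "What does this relate to": "What does this relate to: ",
    "Show me counter examples": "Show me counter examples of: ",
    "Is this correct?": "Is this correct? ",
    "Explain the concept of": "Explain the concept of: ",
    "Create a timeline of this": "Create a timeline of this: ",
    "Discuss the causes and effects of": "Discuss the causes and effects of: ",
}

def compose_query(command: str, selected_text: str) -> str:
    """Preface the user's selected text with the chosen command's prompt."""
    return PROMPTS[command] + selected_text
```

The ‘Edit’ command would then simply let the user add or change entries in this table from Preferences.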

    

Results

The results are shown in a floating window: the query text appears in a box, as below, with the results beneath it. The text can be interacted with to copy it. To dismiss, simply close the window.

  

Preferences

Users will need to supply their own OpenAI API keys (the same approach as for AI in Liquid and MacGPT); this is also where they can turn commands on and off.

Clicking [+] produces this dialog, the same as clicking [Edit] for an existing command:

  

  

‘Ask’ GPT in Liquid

After getting the MacGPT app, which essentially provides toolbar access to GPT, and after user requests, I realised this can’t be that hard to implement in Liquid. So here is the plan: use Liquid as an interface to take whatever the user comes across or writes and, in a few clicks, send it to an AI system (GPT) with a custom prompt, to help students, teachers and general knowledge workers alike:

  

Interaction

A new top-level command in Liquid called ‘Ask’, with shortcut (A), sends selected text to ChatGPT with an associated prompt:

The sub-menu contains options to choose a prompt/how to preface the selected text (not all are on/visible by default):

  • ‘What is’ (W)
  • ‘Write in academic language’ (A)
  • ‘Show me more examples’ (S)
  • ‘What relates to’ (R)
  • ‘Show me counter examples to’ (C)
  • ‘Is this correct?’ (I)
  • ‘Check for plagiarism’ (P)
  • ‘Explain the concept of’ (E)
  • ‘Create a timeline of’ (T)
  • ‘Discuss the causes and effects of’
  • Create a quiz with 5 multiple choice questions that assess students’ understanding of
  • ‘Edit’ (which opens Liquid’s Preferences to allow the user to design their own)

  

Results

Since the API can be slow, as can be seen when using MacGPT and other interfaces, there will be a flashing cursor while waiting for the results. If it is easier to produce the results in a web view, then we will do that.
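For illustration, a request to OpenAI’s chat completions endpoint can be assembled as below. The endpoint and payload shape are the public API as of mid-2023; the function itself, and the default model choice, are a sketch and an assumption, not Liquid’s actual code.

```python
import json

# Endpoint for OpenAI's chat completions API (as of mid-2023).
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(prompt: str, api_key: str, model: str = "gpt-3.5-turbo"):
    """Assemble the headers and JSON body for one chat completion request.
    The caller would POST this and show a flashing cursor until the
    (possibly slow) response arrives."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body
```

Keeping request assembly separate from the UI makes it easy to swap in a web view, or another model, later.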

Note: as with the error about 1980 shown here, AI is not at the stage where it can be trusted to always be correct, and maybe it never will be. Nevertheless, it is a tool, and users need to learn how to use it, including checking what it produces:

Development note: This should ideally be presented in a non-full screen, floating window, for the user to dismiss when done or leave open.

  

Preferences/Key (how it works)

Here the user will be able to customise and make their own preface texts/prompts: enter a name, a shortcut and the full text of the prompt/preface text to send to ChatGPT:

Preferences is also where users add their own API keys for GPT, as inspired by how MacGPT does it, with an option to choose the model as well.

On first use of an AI service, Liquid will show a dialog asking for the API key. If dismissed, it will simply ask again on the next attempt.

Future updates should be able to let the user choose other AI models, including Google Bard.

  

Notes on longer prompts

Some of the actual prompts will be longer than indicated above. This will need some basic experimenting. For example:

Check for plagiarism: I want you to act as a plagiarism checker. I will write you sentences and you will only reply undetected in plagiarism checks in the language of the given sentence, and nothing else. Do not write explanations on replies. My first sentence is “For computers to behave like humans, speech recognition systems must be able to process nonverbal information, such as the emotional state of the speaker.”
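Such longer prompts can be stored as templates with a slot for the selected text. Below is a minimal sketch using the plagiarism prompt above; the {selection} placeholder and the helper function are assumptions for illustration.

```python
# Template for the plagiarism-check prompt quoted above; the {selection}
# slot and helper function are illustrative, not Liquid's actual code.
PLAGIARISM_TEMPLATE = (
    "I want you to act as a plagiarism checker. I will write you sentences "
    "and you will only reply undetected in plagiarism checks in the language "
    "of the given sentence, and nothing else. Do not write explanations on "
    "replies. My first sentence is \u201c{selection}\u201d"
)

def fill_template(template: str, selection: str) -> str:
    """Insert the user's selected text into the stored prompt template."""
    return template.format(selection=selection)
```

Experimenting then becomes a matter of editing template text in Preferences rather than changing code.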

Increased Space

Although the act of writing is an intimate affair, where even a 13″ laptop screen can be ideal, allowing the author to focus, the act of editing and constructing a large document and thinking about connections can benefit from a larger display.

The 27″ Apple Studio Display really does provide some more space to see and to think.

Almost like XR in scale, though of course there is no third dimension. It was the act of working in VR which really showed me how more space helps, however. If the current headsets were less likely to lose connection to my MacBook and had less screen door effect, I might not have needed to purchase this screen, and I would have had the benefit of an even more flexible and portable workspace.

I went from this when working in the Map view in Author:

showing workspace of 13″

to this on the Studio Display:

showing workspace of 27″

Document Links

Based on having document names (not only titles) stored in Visual-Meta when creating a reference in Author, and this being available in Reader, the following should be possible:

User Action

If the user has downloaded the document which is cited (linked to), and it is in a folder known to Reader (or a sub-folder therein), then the user should be able to click on a citation and have the local document open, rather than a web address.

Premise

  • The user has already downloaded the document cited.
  • The document name has not changed.

Questions

  • Can the folder have folders inside it?
  • Is it much work for Reader to check, when the user clicks a citation in the document like this [1], whether the document linked to is on the hard drive?
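A minimal sketch of this lookup, assuming the cited document’s filename is known from Visual-Meta; the function name and folder-scan approach are illustrative, not Reader’s actual code.

```python
from pathlib import Path

def find_cited_document(library_folder: str, document_name: str):
    """Search the folder known to Reader, including any sub-folders,
    for a file matching the cited document's stored filename. Returns
    the local path if found, else None (the caller would then fall
    back to opening the web address)."""
    for candidate in Path(library_folder).rglob(document_name):
        if candidate.is_file():
            return candidate
    return None
```

Because rglob descends into nested directories, the same search handles folders inside the folder, which speaks to the first question above.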

Doug Engelbart

Doug was my friend and mentor. His augmentation framework, which was presented in his 1962 paper, still informs and inspires what I do.

“We need to improve how we augment a group’s (small, large, internal, global etc.) capability to approach urgent, complex problems to gain more rapid and better comprehension (which can be defined as more thorough and more critical) which result in speedier and better solutions (more contextual, longer lasting, cheaper, more equitable etc.). And furthermore, we must improve our improvement process (as individuals and groups).”

Douglas Engelbart

My friend Fleur and I made a brief web based documentary with him. None of the originally uploaded videos are playable, so I have uploaded them to YouTube. To me, this is an example of the brittleness of ‘rich media’ and a reminder how important it is to have our knowledge also stored in robust media, such as text.

He told me how it all started:

…the world is very complex if you are trying figure out what you would fix, etc., and how you’ll go up trying to fix it. And one Saturday I – God – the world is so damn complex it’s hard to figure out.

And that’s what then dawned on me that, oh, the very thing: It’s very complex. It’s getting more complex than ever at a more rapid rate that these problems we’re facing have to be dealt with collectively. And our collective ability to deal with complex urgent problems isn’t increasing at anything like the parent rate that it’s going to be that the problems are.

So if I could contribute as much as possible, so how–generally speaking–mankind can get more capable in dealing with complex urgent problems collectively, then that would be a terrific professional goal. So that’s… It was 49 years ago. And that’s been ever since.

Douglas Engelbart

His Wikipedia entry starts with:

Douglas Carl Engelbart (January 30, 1925 – July 2, 2013) was an American engineer and inventor, and an early computer and Internet pioneer. He is best known for his work on founding the field of human–computer interaction, particularly while at his Augmentation Research Center Lab in SRI International, which resulted in creation of the computer mouse, and the development of hypertext, networked computers, and precursors to graphical user interfaces. These were demonstrated at The Mother of All Demos in 1968. Engelbart’s law, the observation that the intrinsic rate of human performance is exponential, is named after him.

Wikipedia

He wrote the following in an email in September 2003, a statement which still provides me with joy and energy to continue the work on the future of text:

I honestly think that you are the first person I know that is expressing the kind of appreciation for the special role which IT can (no, will) play in reshaping the way we can symbolize basic concepts to elevate further the power that conditioned humans can derive from their genetic sensory, perceptual and cognitive capabilities.

Douglas Engelbart

And finally, Doug after looking at ‘Hyperwords’, the system I developed at the time, a forerunner of Liquid:

Doug Engelbart’s official website is dougengelbart.org

VR/AR/Extended Reality

We call it by many names; VR, AR and XR, but I think it will soon be referred to by the general public simply as putting on a headset. This is similar to how we used to work ‘with hypertext systems’ but now people just ‘go online’ and ‘click on links’.

I am a firm believer in the coming work style of most of using headsets for at least part of the day, smilier to how we might work on a smartphone, laptop and desktop, and even with our watches, as part of our workday. I don’t think the headset will take over, but it will definitely become a useful part of our work. Since this way of working offers much greater opportunities for information presentation, my own thinking is that this will be the ‘native’ information environment for many people and all the traditional media will be thought of as limited access points.