CoCalc -- 2022-06-27-ws.board

CoCalc provides the best real-time collaborative environment for Jupyter Notebooks, LaTeX documents, and SageMath, scalable from individual users to large groups and classes!

GitHub Repository: williamstein/scratch
Path: blob/main/2022-06-27-ws.board

Monday

sandbox todo

 sage worksheets are totally broken!
 probably also broken in cocalc-docker, so update those images too
 #now make it so a click is needed to activate the sandbox or cocalc embed in all cases (we can preload the app, but only open the project it points to when a link is clicked)? or do nothing.
 add the sandbox at the top of all the landing pages 
 add a little text about what the sandbox is.
 re-enable when the above is done.

webassembly python

 get lvma to build for wasm

Tuesday

 Jupyter timeout issues ticket

 Try to reproduce in dev project.
I absolutely cannot reproduce this.  No clue.
I did make some miscellaneous changes that are hopefully improvements.
I also made all api calls reuse in flight (concurrent identical calls share one request).
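
For reference, a minimal sketch of what "reuse in flight" means here. This is an illustration of the idea, not the helper CoCalc actually uses; the function name and the choice of `JSON.stringify(args)` as the cache key are assumptions.

```typescript
// Wrap an async function so that concurrent calls with the same arguments
// share a single pending promise instead of each starting their own request.
function reuseInFlight<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>
): (...args: A) => Promise<R> {
  // Pending calls, keyed by the serialized arguments (assumed JSON-safe).
  const inFlight = new Map<string, Promise<R>>();
  return (...args: A): Promise<R> => {
    const key = JSON.stringify(args);
    const existing = inFlight.get(key);
    if (existing !== undefined) {
      return existing; // an identical call is already running; reuse it
    }
    const promise = fn(...args);
    // Forget the call once it settles (success or failure), so later calls
    // start a fresh request rather than getting a stale result.
    const cleanup = (): void => {
      inFlight.delete(key);
    };
    promise.then(cleanup, cleanup);
    inFlight.set(key, promise);
    return promise;
  };
}
```

Note this only deduplicates calls that overlap in time; it is not a cache, since nothing is kept after the promise settles.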

 Meetings

 ipywidgets at 9:30am
 vantage at 10am
 startup guys from USC at 1pm

 Sandbox

 reuse in flight the project configuration data for a while

 put a configurable hard limit (e.g., 100?) on the number of users that can be added to the sandbox project, and change the timeout to 5 minutes.    This should probably be in the configuration in the project.
Delaying this to see what we can handle... WHEN I'm not AFK

 Nextjs upgrade -- https://nextjs.org/blog/next-12-2
 upgraded code
 build and release

 switch to make share search by stars first, then timestamp
 also fix bug in rendering this notebook: https://cocalc.com/share/public_paths/dc5b8a2570b3ffe695b57982c397aefb26d8faf9
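
The "stars first, then timestamp" ordering could be a comparator along these lines. The field names (`stars`, `lastEdited`) and the descending direction for both keys are assumptions, not taken from the share server code.

```typescript
// Hypothetical shape of a share search result.
interface SharedPath {
  id: string;
  stars: number; // number of users who starred this public path
  lastEdited: number; // timestamp in ms since the epoch
}

// Most-starred first; ties broken by most recently edited first.
function byStarsThenTimestamp(a: SharedPath, b: SharedPath): number {
  if (a.stars !== b.stars) {
    return b.stars - a.stars;
  }
  return b.lastEdited - a.lastEdited;
}

// Return a sorted copy without mutating the input.
function sortShares(paths: SharedPath[]): SharedPath[] {
  return paths.slice().sort(byStarsThenTimestamp);
}
```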

Wednesday

side/personal things

 "paper with Lum" / Ohana thesis -- https://mail.google.com/mail/u/0/#inbox/FMfcgzGpGdjMNBVpjTsfFxVrRLkbmHZB

 wapython

get wasm python build past the "no emscripten.h" step.

 Ipywidgets message and buffer support -- https://github.com/sagemathinc/cocalc/pull/6011
created this during ipywidgets dev meeting -- https://github.com/sagemathinc/cocalc/issues/6022

 Watch button

 for shared files, along with explicit-only updates for shares (i.e., you click a button when you want the shared content copied to the share server, and optionally enter a commit message)

 github url proxying -- https://github.com/sagemathinc/cocalc/issues/6015
motivating example: https://sagemanifolds.obspm.fr/intro_to_manifolds.html 

Updates

 update frontend app in prod, since it is slightly old and has a bug I hit as admin looking at project settings (that I fixed days ago in the code).
 update cocalc-docker on arm and x86_64

So... how would this work?  Maybe all I should do today is think about this?
A big difference from nbviewer is that there is the possibility of sharing (and soon watching).  Thus we need a database entry, i.e., a public path.  It would also be nice to make things fully work, e.g., if the notebook links to data in the same repo, that should work.
To do this, we could make it so when you visit such a URL, it clones or updates the repo to files on the share server, and then just uses that. What does nbviewer do?

Idea - clone to project

We could create a special cocalc project that is for the purposes of proxying github.  It has some id, and in admin settings you enter that in a box.
When a request for github comes in to the share server, we use slightly different logic to handle it.  That involves:
check if clone exists -- if so, serve it, but also fire off an update (git pull) in the project itself
if clone doesn't exist, start project, then exec code to clone from github.
This is nice since it would work the same way in dev/docker/cocalc.com
This is very bad, since it would be easy to hit the quota for a project, or basically make one project get really large.  Also, in kucalc, the steps are (1) clone to project, then (2) rsync from project to share server.  That's a major increase in time, and a waste of space.

Idea - clone to share server directly

We still have a project dedicated to this
But in kucalc it's just a placeholder to get the project_id so we know where to put things in share server
(In cc-in-cc-dev and cocalc-docker, files do end up in project, since things are the same.)
Cloning is actually done by hub-share in kucalc.  In cocalc-docker, the same code is run directly by the share server; this will need extra logic and it'll be a pain to develop, but it's doable and probably not too hard.  It means hub-share needs git installed.
Update/clone:  (1) first try git pull, (2) if that fails, switch to rm then git clone.
I wonder if trying to use git at all is a terrible idea?  The repo could be big... and maybe we just want to work with a single file easily.
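
The update/clone fallback above could be sketched as follows. `Run` is a hypothetical injected command runner that throws when the command fails (in practice it might wrap Node's `child_process.execFileSync`); injecting it keeps the logic testable without touching git. The function name and return values are also just for illustration.

```typescript
// Runs a command with arguments; must throw if the command fails.
type Run = (cmd: string, args: string[]) => void;

// Try to update an existing clone in `dir`; if that fails for any reason
// (no clone yet, diverged history, corrupt repo), wipe it and clone fresh.
function updateOrClone(
  run: Run,
  repoUrl: string,
  dir: string
): "pulled" | "cloned" {
  try {
    run("git", ["-C", dir, "pull"]);
    return "pulled";
  } catch (_err) {
    run("rm", ["-rf", dir]);
    run("git", ["clone", repoUrl, dir]);
    return "cloned";
  }
}
```

This still clones the whole repo, so it does nothing about the "repo could be big" worry.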

Pure memory version

when request comes to share server, we find the corresponding raw url (e.g., with a simple transform)
we fetch that raw content into memory, with some limit on size, e.g., get the first xxxMB and never more.
render it.
also do the other things, e.g., directory listing, projects for org, etc., like nbviewer does.
the edit button in cocalc does something different in the case of github, namely it starts or creates a project, then runs git clone in that project and finally opens the relevant file.
contributing back to upstream as a PR would then be something we can implement later.
If the notebook refers to a file it still could be possible to work via similar fetch, etc., on that file.
An advantage to all this is that it will be identical in cocalc-docker, dev, kucalc, etc.  It also will be optimal in speed, wastes no space, etc.  This is clearly the right solution.
I'm going to implement this today.
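
A rough sketch of the two pieces of the pure memory version: the "simple transform" to a raw url, and the size cap. The function names are made up for illustration, and the transform assumes the branch or tag name contains no "/" (otherwise the url is ambiguous). The byte cap is shown over an array of chunks; in a real fetch these would be the chunks read from the response body.

```typescript
// Map a github.com "blob" url to the corresponding raw content url.
// Returns null for urls that don't match the expected shape.
function githubToRawUrl(url: string): string | null {
  const m = /^https:\/\/github\.com\/([^/]+)\/([^/]+)\/blob\/([^/]+)\/(.+)$/.exec(url);
  if (m === null) {
    return null;
  }
  const owner = m[1];
  const repo = m[2];
  const ref = m[3]; // branch, tag, or commit (assumed to contain no "/")
  const path = m[4];
  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${path}`;
}

// Keep at most maxBytes from a sequence of chunks and never more.
function takeFirstBytes(chunks: Uint8Array[], maxBytes: number): Uint8Array {
  const pieces: Uint8Array[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const room = maxBytes - used;
    if (room <= 0) {
      break; // limit reached; ignore everything after this
    }
    const piece = chunk.subarray(0, room);
    pieces.push(piece);
    used += piece.length;
  }
  // Concatenate the kept pieces into a single buffer.
  const out = new Uint8Array(used);
  let offset = 0;
  for (const piece of pieces) {
    out.set(piece, offset);
    offset += piece.length;
  }
  return out;
}
```

For example, `https://github.com/williamstein/scratch/blob/main/2022-06-27-ws.board` maps to `https://raw.githubusercontent.com/williamstein/scratch/main/2022-06-27-ws.board`.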

Rule: we will proxy github URLs if and only if there is an organization with the name github.  Obviously, only an admin can create such an organization.
Definition: for a github url, the public_path_id is the sha1 hash of the organization id and the rest of the github path.  The record gets automatically created in the database whenever requested.
 in https://cocalc.com/projects/10f0e544-313c-4efe-8718-2142ac97ad11/files/cocalc/src/packages/next/lib/names/public-path.ts make it so it has a special case when the owner is github to use that organization_id always as the project_id.
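
The definition above could look like this. How the organization id and the path get combined before hashing isn't decided in these notes; joining them with "/" is an assumption, as is the function name.

```typescript
import { createHash } from "crypto";

// Deterministic id for a proxied github path: sha1 of the organization id
// plus the rest of the github path (hex encoded, 40 characters).
// ASSUMPTION: the two parts are joined with "/" before hashing.
function githubPublicPathId(organizationId: string, githubPath: string): string {
  return createHash("sha1")
    .update(`${organizationId}/${githubPath}`)
    .digest("hex");
}
```

Since the id is a pure function of its inputs, the database record can be created lazily on first request and found again on every later one.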

I'm causing myself a lot of confusion by trying to design something that will solve a very general problem, e.g., proxying for CUP via a non-github approach.  I could instead implement the most straightforward approach for just github as a special case; then, once it works and I have some experience, rewrite it to be general if a good design emerges.

Plan to implement GitHub proxy: https://github.com/sagemathinc/cocalc/pull/6026
This is hopefully a "less than one day" project.
 make it so in site settings admin can set a github proxy project_id;   github_proxy_project_id and githubProxyProjectId
also add githubProxyProjectId to the customize stuff.
 in https://cocalc.com/projects/10f0e544-313c-4efe-8718-2142ac97ad11/files/cocalc/src/packages/next/lib/names/public-path.ts make it so it has a special case when the owner is github to use that project_id always, and automatically create the public_path_id record.
 #next when grabbing content to render, check if the project_id is githubProxyProjectId and if so, get via fetch.
 when copying content to file to edit,  check if the project_id is githubProxyProjectId and if so, run a git clone command in the user's project instead of copying from the proxy project.
 add link to upstream github repo at top.
With this, starring will still work, though we'll need to special case watch.  Also, the target will appear in the list of shared files, which is actually pretty cool, and will provide potentially massive value regarding SEO that we just wouldn't get otherwise.
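
The two special cases in the plan (where content comes from when rendering, and what the edit button does) could be decided by something like the following. Only `githubProxyProjectId` comes from the plan above; the other names and the string labels are hypothetical.

```typescript
// Subset of the customize/site settings relevant here.
interface SiteSettings {
  githubProxyProjectId?: string; // unset or empty means proxying is disabled
}

type ContentSource = "github-fetch" | "share-server";
type EditAction = "git-clone" | "copy-from-project";

// True only when proxying is configured and this is the proxy project.
function isGithubProxyProject(projectId: string, settings: SiteSettings): boolean {
  const proxyId = settings.githubProxyProjectId;
  return proxyId !== undefined && proxyId !== "" && projectId === proxyId;
}

// When rendering: fetch from github for the proxy project, else use the
// files already on the share server.
function contentSourceFor(projectId: string, settings: SiteSettings): ContentSource {
  return isGithubProxyProject(projectId, settings) ? "github-fetch" : "share-server";
}

// When editing: git clone into the user's project for the proxy project,
// else copy from the source project as usual.
function editActionFor(projectId: string, settings: SiteSettings): EditAction {
  return isGithubProxyProject(projectId, settings) ? "git-clone" : "copy-from-project";
}
```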
