"What about trace()?". Idea behind the package
Source:vignettes/what-about-trace.Rmd
what-about-trace.Rmd
I. The Story
The Myth
In his talk given at 2016 Shiny Developer Conference, Jonathan McPherson noticed:
If you’re a seasoned R programmer, you may have used the
trace()
function to add tracing without modifying your script. Unfortunately, it’s not possible to use this utility (or any that depend on it, such assetBreakpoint
) with Shiny.trace()
works by rewriting the body of the function to be traced, so the function must already exist when you run it. Shiny generates functions at runtime that aren’t easily addressable.
It was a fragment titled “What about trace()?” and is now available as a official documentation for Shiny1. And we could say it is somehow suspicious, since in the same paragraph it is said that something is impossible, because some other thing is not easily addressable, but not - as we could expect - impossible to be addressable. It’s a good paragraph to start with, even that by saying that everything started, because of this suspicion, would be nothing more than founding myth.
Where Are You, Objects?
Our aim will be the modification of the body of function to insert
browser()
there, but at first we need to find all
functions. We can do this using
lapply(rlang::env_parents(rlang::current_env()), names)
in
the example below:
app.R
library(shiny)
ui <- fluidPage(
modUI("mod"),
textOutput("env")
)
server <- function(input, output, session) {
modServer("mod")
output$env <- renderPrint({
lapply(rlang::env_parents(rlang::current_env()), names)
})
}
shinyApp(ui, server)
R/mod.R
modUI <- function(id) {
ns <- NS(id)
tagList(
numericInput(ns("num"), "Num", 0)
)
}
modServer <- function(id) {
moduleServer(
id,
function(input, output, session) {
}
)
}
The code above iterates over the environments, starting from the current environment, and move through the parent environments, displaying objects and environments as well.
By running this, we will learn that the structure of objects in the Shiny app have the following characteristic:
- on the top (or: as the parent) is
global
environment, but in our example it doesn’t contain any objects created by us - next, going from the top to the bottom (i.e. to the first child) we
will find the objects from our module (saved in R/mod.R file) -
UI
andserver
part (but named asmodUI
andmodServer
) - finally, as the grandchild we see the objects from the file where we
have started -
server
andui
from app.R file (we can ignore the rest of the objects and environments) - some environments are named (like global environment) and some of them are not
Graphically this structure looks like this:
"global": .Random.seed
└── "": modServer, modUI
└── "": server, ui
└── "": input, output, session
and so on...
This is, of course, the evidence that documentation was right:
all R code in a Shiny app is run in the global environment or a child of it2
Now, because we were able to find the objects (and our search was very precise - we also get the environments where these objects live), we can try to modify them.
And Where Are Your Elements?
We want to modify body of function and we will use
body()
for this. Not trying to be precise, but just useful
for our purposes, we can say that body()
returns
very complex list (although it is not of type
list
, but language
).
server <- function(input, output) {
shiny::observe({
x <- 2
x
})
}
body(server)
#> {
#> shiny::observe({
#> x <- 2
#> x
#> })
#> }
body(server)[[2]][[2]][[2]]
#> x <- 2
body(server)[[2]][[2]][[2]][[1]]
#> `<-`
This list can be manipulated - elements can be added, removed etc. We
just need some additional functions for this, like
as.list()
(so we can use append()
to add
elements to the list), as.call()
(to turn back the list
into language
object) or quote()
to not
evaluate browser()
:
server <- function(input, output) {
shiny::observe({
x <- 2
x
})
}
body(server)[[c(2, 2)]]
#> {
#> x <- 2
#> x
#> }
body(server)[[c(2, 2)]] <- as.call(append(as.list(body(server)[[c(2, 2)]]), quote(browser()), 1))
body(server)[[c(2, 2)]]
#> {
#> browser()
#> x <- 2
#> x
#> }
now the breakpoint is set after the first element of body (and the first element is a bracket). Let’s do the same in the complete Shiny app:
app.R
library(shiny)
library(magrittr)
ui <- fluidPage(
modUI("mod"),
textOutput("env"),
actionButton("browser", "Add browser")
)
server <- function(input, output, session) {
modServer("mod")
output$env <- renderPrint({
mod_env <- environment(modServer) # get the environment where `modServer` is defined
as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]])
})
observe({
mod_env <- environment(modServer)
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
}) %>%
bindEvent(input$browser)
}
shinyApp(ui, server)
R/mod.R
modUI <- function(id) {
ns <- NS(id)
tagList(
numericInput(ns("num"), "Num", 0)
)
}
modServer <- function(id) {
moduleServer(
id,
function(input, output, session) {
observe({
input$num
})
}
)
}
We can notice that when Add browser
button is pushed,
nothing happens. However, it is not true - browser()
is
added, but we didn’t yet run the function. In other words, we have added
browser()
and have not used it yet. We can use another
example to show this clearly:
fun1 <- function() {
x <- 2
x
}
body(fun1) <- as.call(append(as.list(body(fun1)), quote(browser()), 2))
body(fun1)
#> {
#> x <- 2
#> browser()
#> x
#> }
How to know if browser()
really works? It’s a tricky
question, because it suggests that we can know if something will be run
before actually running this. Instead of trying to answer this, let’s
just run the function: fun1()
. We will end up in the debug
mode.
We can achieve the same effect in Shiny app by refreshing the session
- by hand, clicking on the refresh button in web browser or by adding
getDefaultReactiveDomain()$reload()
to reload the session
(i.e. by using reload()
method on session object, so it
could be also session$reload()
):
app.R
library(shiny)
library(magrittr)
ui <- fluidPage(
modUI("mod"),
textOutput("env"),
actionButton("browser", "Add browser")
)
server <- function(input, output, session) {
modServer("mod")
output$env <- renderPrint({
mod_env <- environment(modServer)
as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]])
})
observe({
mod_env <- environment(modServer)
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
getDefaultReactiveDomain()$reload() # added
}) %>%
bindEvent(input$browser)
}
shinyApp(ui, server)
Doing this, we re-run the modServer()
function.
Find Them Again (And Again)
Among other (not yet mentioned) problems, we still need a tools to
find where to insert browser()
, because until now
we have just used our eyes. This is somehow secondary issue as we mainly
would like to show in this vignette the core concept of setting
breakpoint, even if by hand, so solutions below are rather only
signaled.
Having a (named) function, which is in a saved file (which was
source()
d, so the srcref
attribute is
present), we can use the following functions:
-
utils::getParseData()
to get thedata.frame
needed to reconstruct the whole script in this saved file (utils::getParseText()
can be useful in this context as well) -
utils::getSrcFilename()
to get the name of the file -
utils::findLineNum()
to get the (1) name of the object; (2) environment where this object lives; (3) location in the body of this object based on line number in the script (i.e. which indices to use to get the right location when usingbody()[[]]
)
However, we quickly find out that after modifying the body of
function, we broke it up - utils::findLineNum()
no longer
returns the correct indices (at
element of returned list is
broken):
app.R
library(shiny)
library(magrittr)
ui <- fluidPage(
modUI("mod"),
textOutput("env"),
actionButton("browser", "Add browser")
)
server <- function(input, output, session) {
modServer("mod")
output$env <- renderPrint({
findLineNum("R/mod.R", 13)[[1]]$at # check how this changed after the first use of button 'Add browser'
})
observe({
mod_env <- environment(modServer)
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
getDefaultReactiveDomain()$reload()
}) %>%
bindEvent(input$browser)
}
shinyApp(ui, server)
R/mod.R
modUI <- function(id) {
ns <- NS(id)
tagList(
numericInput(ns("num"), "Num", 0)
)
}
modServer <- function(id) {
moduleServer(
id,
function(input, output, session) {
observe({
input$num
})
}
)
}
In the example above, we display the at
element of
returned list - when the Add browser
button is pushed by
the first time (and after that the debug mode is closed by pressing
c
or f
), we can see that the displayed indices
changed and we no longer can use this element to find the correct
location in the body of function. In other words, our code now works
just one time - after the first use, the app should be closed and run
again (if at
would be use to determine the location of
body, of course; in the example above we are still using hard-coded
location). To resolve this problem, we need to retrieve the original
body of object (function) - one way to do this would be to use
parse()
. We already know the name of file (and full path as
well) from which comes the function, so it is possible to parse the file
and assign the original body of function to the broken one, before the
browser()
is added.
And Forget Them
All of our actions taken until now led us to the state of greed; we
find objects, modify them and never undo the changes; and then we modify
them again, again, adding more browser()
s. This can be seen
in our previous example - each time the Add browser
button
is pushed, next browser()
is added at the top of the
previous one. Now this greed leads us into trouble:
- we need to escape (using
c
orf
) from debug mode multiple times before we escape for real - eachbrowser()
nests us into debug mode - but what is much worse - the
browser()
persists and each time theobserve()
runs (and it runs every time theinput$num
changes), we end up in the debug mode
The remedy for this would be to remove the browser()
after each use, i.e. after pressing c
or f
(after closing) in the debug mode. But it must be done within
the function which is debugged (because of the session reload). In other
words, the function will remove its own elements. Here is an
exemplification of this mechanism:
fun2 <- function() {
2
env <- environment(fun2) # get the environment where `fun2` is defined
body(env$fun2) <- body(env$fun2)[-c(3, 4)]
}
body(fun2)
#> {
#> 2
#> env <- environment(fun2)
#> body(env$fun2) <- body(env$fun2)[-c(3, 4)]
#> }
fun2()
body(fun2)
#> {
#> 2
#> }
We can see above how to include the code which removes itself; this
code can be added to the function using other function as well (in the
example above we have just added it at the beginning, by hand). In case
of Shiny app, we need to remember that another session reload will be
necessary as an equivalent of function call (equivalent of
fun2()
). Below is an example of Shiny app with a one-time
breakpoint.
app.R
library(shiny)
library(magrittr)
ui <- fluidPage(
modUI("mod"),
actionButton("browser", "Add browser"),
)
server <- function(input, output, session) {
modServer("mod")
observe({
mod_env <- environment(modServer)
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2))
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(body(env$modServer)[[2]][[3]][[3]][[2]][[2]] <- body(env$modServer)[[2]][[3]][[3]][[2]][[2]][-c(2, 3, 4, 5)]), 3))
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(getDefaultReactiveDomain()$reload()), 4))
getDefaultReactiveDomain()$reload()
}) %>%
bindEvent(input$browser)
}
shinyApp(ui, server)
R/mod.R
modUI <- function(id) {
ns <- NS(id)
tagList(
numericInput(ns("num"), "Num", 0)
)
}
modServer <- function(id) {
moduleServer(
id,
function(input, output, session) {
observe({
input$num
})
}
)
}
This example looks quite complex now, so it may be useful to describe (even if repeating something) crucial steps:
-
mod_env <- environment(modServer)
- we are just getting the environment wheremodServer()
function is defined - it will be later explained, why we use this -
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
- browser is added -
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2))
- to the function we are adding the code responsible to get the environment where this function is defined (i.e. function in which we are setting the breakpoint) -
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(body(env$modServer)[[2]][[3]][[3]][[2]][[2]] <- body(env$modServer)[[2]][[3]][[3]][[2]][[2]][-c(2, 3, 4, 5)]), 3))
- we are adding to the function the code responsible to remove the added code from the function -
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(getDefaultReactiveDomain()$reload()), 4))
- we are adding the session reload to the function (session will be refreshed after closing debug mode) -
getDefaultReactiveDomain()$reload()
- we are reloading the session (after breakpoint is set)
The very interesting and useful thing is that the line of code can remove the next line of code, but at the same time this next line of code can still be executed - we are using this to reload the session after the added code was removed and to remove the code responsible for this reload. Perhaps it will be clearer to see it, taking as an example regular function:
fun2 <- function() {
2
env <- environment(fun2)
body(env$fun2) <- body(env$fun2)[-c(3, 4, 5)]
"two"
}
body(fun2)
#> {
#> 2
#> env <- environment(fun2)
#> body(env$fun2) <- body(env$fun2)[-c(3, 4, 5)]
#> "two"
#> }
fun2()
#> [1] "two"
body(fun2)
#> {
#> 2
#> }
When we run the function fun2
first time, the last
element (string "two"
), which we are removing, is returned,
so when run first time, the function will execute all lines of code, but
after this we can check that the body has changed and if we would run
the function again, only the number 2
would be returned.
This is very important, because it means that we can add the session
reload to the function and this session reload will be run (so the
changes can be applied) and then removed from the function.
It should be noted here, that even if it looks like we are retrieving
the original body of function (because added elements has been removed),
this is not an answer for the (mentioned earlier) problem with element
at
in list returned by utils::findLineNum()
function - at
will still contain the wrong indices.
“Where Are You, Objects?”
There is one last thing to talk about - getting the (right)
environment. In all examples of Shiny app presented until now, we have
used environment()
to get the environment of object
(function), but we have did it also in the example of regular function
to remove (within the function) elements of this function. These
examples personify two aspects important for us- why it is necessary to
get the environment where object is defined and why it is necessary to
be precise.
The code below should help us answer, why we need the environment:
a <- 4
fun2 <- function() {
a <- 2
}
fun2()
a
#> [1] 4
a <- 4
fun2 <- function() {
env <- .GlobalEnv
env$a <- 2
}
fun2()
a
#> [1] 2
If we stop thinking about this problem in terms of self-referencing,
it looks simple - we can’t modify the object defined in the parent from
the child environment, because of the isolation (function closures the
environment), however we can do this if we explicitly refer to the
environment where the object if defined. Now we just need to remind
ourselves that the function is defined in different environment than
environment where exist function’s elements- is defined in the parent
environment. Thus to modify the function (from that function) we need to
refer explicitly to the environment where this function is defined -
this is not different from our example with the a
variable
above.
That was the answer, why we needed to refer to the environment inside
the function, but why we needed this also in the observe
at
the beginning, where we set the breakpoint? On the one hand, it was
indeed unnecessary - we could easily refer to the object directly, but
(on the other hand) this is not safe - what if there would be a
functions having the same name, but in different environments? How to
refer to this function, not that function?
environment()
returns the environment for the
nearest function and it may be not enough for our purposes.
env_inner <- ""
env_outer <- ""
fun3 <- function() {
fun3 <- function() {
fun3 <- function() {
env_inner <<- environment(fun3)
}
fun3()
}
fun3()
env_outer <<- environment(fun3)
}
fun3()
env_inner
#> <environment: 0x0000000022e49560>
env_outer
#> <environment: 0x0000000022e49678>
We see three functions, all of them have the same name and all of
them live in different environment. We already know that the function we
want to refer to is in an upper environment (because we are inside this
function), but if we are not sure if it is a direct parent, we can end
up having the reference to the wrong object, because - as already said -
environment()
gives us environment for the nearest object
of chosen name.
In our previous examples we have used environment()
, but
that was just for simplicity - we see now that it wasn’t a great idea.
shinybreakpoint
uses something different - function which
searches through the environments to get the needed environment.
To be able to refer precise to given function, it is needed to know
the environment. Relatively easy would be to connect object with
environment upfront, when searching for objects across the environments
using code mentioned at the beginning,
i.e. lapply(rlang::env_parents(rlang::current_env()), names)
.
We could at first use
rlang::env_parents(rlang::current_env())
to get the list of
environments and then get the objects as a list, using
lapply()
and names()
. This way we would have
two lists of the same length and every element will point to
the same environment, but the problem will occur when trying to use this
environment in the code added along with browser()
. Coming
back to the Shiny app, we have used this line of code before:
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2))
We are constructing the env
variable which is an
environment, but how to know if this is the environment which contains
modServer
which we are looking for and not the other
function of the same name, but in the different environment (like showed
before)? So we know that it is possible to have the list of environments
and objects (functions) which belong to the given environment, but how
to use this knowledge? This solution below won’t work:
fun3 <- function() {
3
}
env <- environment(fun3) # global environment
call("assign", "envir", env)
#> assign("envir", <environment>)
# assign("envir", <environment>)
# Error: unexpected '<' in "assign("envir", <"
We have constructed the call (which is an assignment), using the
variable env
which is an environment where
fun3
is defined, but this call is not correct - we see that
it is impossible to treat the environment like regular value - we can’t
simply evaluate the variable to get the value.
But the idea is not bad - we already have the object
(fun3
) and the environment (env
), so we could
just add to the body of function the code which will bind environment to
the name envir
. This way we would have the correct
environment to refer to it, but as we see, instead of get something like
assign("envir", .GlobalEnv)
, we have the incorrect syntax:
assign("envir", <environment>)
. We can, however, keep
the label of environment we need and search for the correct environment
using this label. Label is the name of environment (if named) or address
in memory - to get the label, we use
rlang::env_label()
.
fun3 <- function() {
3
}
env_lab <- rlang::env_label(environment(fun3))
get_envir <- function(label) {
envirs <- rlang::env_parents(rlang::current_env())
names(envirs) <- lapply(envirs, rlang::env_label)
envirs[[label]]
}
call("assign", "envir", call("get_envir", env_lab))
#> assign("envir", get_envir("global"))
The expression assign("envir", get_envir("global"))
can
now be added along with browser()
to get the right
environment - we know that it is (in the example above)
"global"
environment and our search (function
get_envir()
) will give us precise environment - the labels
are unique. Of course, we still should think how to make sure that
get_envir()
will be visible in the environment (function)
where this code will be added. In case of Shiny app it can be stored in
global environment (all objects from global environment are visible in
any environment in the app, because it is a top environment). In case of
shinybreakpoint
, because this function lives in the
{shnybreakpoint}
environment (namespace), access to this
function is not a problem as well (if package is installed).
II. Closure
Setting breakpoint is a technique which is based on source code understanding as a commands executed line-by-line. This is how the source code is visible by humans - it is stored like a text and we all used to read the text knowing that the next statement is on the left, right, down. Functions make it a little more complicated - function definition can be far, far away from the line where this function is called, so to know, why we received the results we see, it may be necessary to jump in the different place in the file to check the function body (or set breakpoint). Reactive programming makes it even more complicated - it is not so easy to know, which code block has been executed when some input changed.
The idea of storing functions, grouped functions or modules in separate files is a try to give a source code more intuitive structure and this is a point where difference between text being a story and text being a code becomes tangible. We could even start thinking that affinity of code to the text is a ballast. Isn’t true that we split source code into multiple files because we want to be able to read it as a text (story) as easily as possible and not as a fragmented text (story)? And don’t we feel that setting breakpoint is now the same as selecting a text fragment in a book even if we are not writing or reading a book? Of course, someone could say that we can use digital tools to search a text fragment in a source code or a book, but there is one more important difference between story and a code - story is executed in our brains, but code is not. Do we really need to expose ourselves on all of this source code written as text, on each line? And even if in some smaller file - do we still need to always see all of these characters?
This last chapter is not a summary, but is about the idea that perhaps it would be useful to not display the source code, but relations between code blocks which belong to the reactive context to minimize the exposition of developer to the text as far as possible; and using the fact that code was executed in the machine, not brain, it could be useful to display the last relation, so the breakpoint can be set somewhere in the chain relevant for output or side effect. All of this is of course not new, but because is still a matter of future, should be reminded in this document.